From patchwork Fri Jul 26 08:45:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060565 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D04EF912 for ; Fri, 26 Jul 2019 08:48:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BD4AF205FC for ; Fri, 26 Jul 2019 08:48:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B129D28A3D; Fri, 26 Jul 2019 08:48:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 4F15E205FC for ; Fri, 26 Jul 2019 08:48:00 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AFA986E8C0; Fri, 26 Jul 2019 08:47:59 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5BC9D6E8B5 for ; Fri, 26 Jul 2019 08:47:58 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621074-1500050 for multiple; Fri, 26 Jul 2019 09:46:14 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:47 +0100 Message-Id: <20190726084613.22129-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 01/27] drm/i915/gt: Add to timeline requires the timeline mutex X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Modifying a remote context requires careful serialisation with requests on that context, and that serialisation requires us to take their timeline->mutex. Make it so. Note that while struct_mutex rules, we can't create more than one request in parallel, but that age is soon coming to an end. v2: Though it doesn't affect the current users, contexts may share timelines so check if we already hold the right mutex. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_context.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 9292b6ca5e9c..d64b45f7ec6d 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -254,10 +254,18 @@ int intel_context_prepare_remote_request(struct intel_context *ce, /* Only suitable for use in remotely modifying this context */ GEM_BUG_ON(rq->hw_context == ce); - /* Queue this switch after all other activity by this context. */ - err = i915_active_request_set(&tl->last_request, rq); - if (err) - return err; + if (rq->timeline != tl) { /* beware timeline sharing */ + err = mutex_lock_interruptible_nested(&tl->mutex, + SINGLE_DEPTH_NESTING); + if (err) + return err; + + /* Queue this switch after current activity by this context. */ + err = i915_active_request_set(&tl->last_request, rq); + if (err) + goto unlock; + } + lockdep_assert_held(&tl->mutex); /* * Guarantee context image and the timeline remains pinned until the @@ -267,7 +275,12 @@ int intel_context_prepare_remote_request(struct intel_context *ce, * words transfer the pinned ce object to tracked active request. */ GEM_BUG_ON(i915_active_is_idle(&ce->active)); - return i915_active_ref(&ce->active, rq->fence.context, rq); + err = i915_active_ref(&ce->active, rq->fence.context, rq); + +unlock: + if (rq->timeline != tl) + mutex_unlock(&tl->mutex); + return err; } struct i915_request *intel_context_create_request(struct intel_context *ce) From patchwork Fri Jul 26 08:45:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060571 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4866D13B1 for ; Fri, 26 Jul 2019 08:48:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 38EBC205FC for ; Fri, 26 Jul 2019 08:48:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2D54328A3B; Fri, 26 Jul 2019 08:48:08 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 402E928894 for ; Fri, 26 Jul 2019 08:48:07 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 118486E8C1; Fri, 26 Jul 2019 08:48:06 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5B6E66E8B9 for ; Fri, 26 Jul 2019 08:47:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621075-1500050 for multiple; Fri, 26 Jul 2019 09:46:14 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:48 +0100 Message-Id: <20190726084613.22129-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 02/27] drm/i915: Unshare the idle-barrier from other kernel requests X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Under some circumstances (see intel_context_prepare_remote_request), we may use a request along a kernel context to modify the logical state of another. To keep the target context in place while the request executes, we take an active reference on it using the kernel timeline. This is the same timeline as we use for the idle-barrier, and so we end up reusing the same active node. Except that the idle barrier is special and cannot be reused in this manner! Give the idle-barrier a reserved timeline index (0) so that it will always be unique (give or take we may issue multiple idle barriers across multiple engines). Reported-by: Lionel Landwerlin Fixes: ce476c80b8bf ("drm/i915: Keep contexts pinned until after the next kernel context switch") Fixes: a9877da2d629 ("drm/i915/oa: Reconfigure contexts on the fly") Signed-off-by: Chris Wilson Cc: Lionel Landwerlin Cc: Tvrtko Ursulin Tested-by: Lionel Landwerlin --- drivers/gpu/drm/i915/gt/intel_context.c | 40 ++- drivers/gpu/drm/i915/gt/intel_context.h | 13 +- drivers/gpu/drm/i915/gt/selftest_context.c | 310 ++++++++++++++++++ drivers/gpu/drm/i915/i915_active.c | 69 +++- .../drm/i915/selftests/i915_live_selftests.h | 3 +- 5 files changed, 398 insertions(+), 37 deletions(-) create mode 100644 drivers/gpu/drm/i915/gt/selftest_context.c diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index d64b45f7ec6d..211ac6568a5d 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -162,23 +162,41 @@ static int __intel_context_active(struct i915_active *active) if (err) goto err_ring; + return 0; + +err_ring: + intel_ring_unpin(ce->ring); +err_put: + intel_context_put(ce); + return err; +} + +int intel_context_active_acquire(struct intel_context *ce) +{ + int err; + + err = i915_active_acquire(&ce->active); + if (err) + return err; + /* Preallocate tracking nodes */ if (!i915_gem_context_is_kernel(ce->gem_context)) { err = i915_active_acquire_preallocate_barrier(&ce->active, ce->engine); - if (err) - goto err_state; + if (err) { + i915_active_release(&ce->active); + return err; + } } return 0; +} -err_state: - __context_unpin_state(ce->state); -err_ring: - intel_ring_unpin(ce->ring); -err_put: - intel_context_put(ce); - return err; +void intel_context_active_release(struct intel_context *ce) +{ + /* Nodes preallocated in intel_context_active() */ + i915_active_acquire_barrier(&ce->active); + i915_active_release(&ce->active); } void @@ -297,3 +315,7 @@ struct i915_request *intel_context_create_request(struct intel_context *ce) return rq; } + +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) +#include "selftest_context.c" +#endif diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h index 23c7e4c0ce7c..07f9924de48f 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.h +++ b/drivers/gpu/drm/i915/gt/intel_context.h @@ -104,17 +104,8 @@ static inline void intel_context_exit(struct intel_context *ce) ce->ops->exit(ce); } -static inline int intel_context_active_acquire(struct intel_context *ce) -{ - return i915_active_acquire(&ce->active); -} - -static inline void intel_context_active_release(struct intel_context *ce) -{ - /* Nodes preallocated in intel_context_active() */ - i915_active_acquire_barrier(&ce->active); - i915_active_release(&ce->active); -} +int intel_context_active_acquire(struct intel_context *ce); +void intel_context_active_release(struct intel_context *ce); static inline struct intel_context *intel_context_get(struct intel_context *ce) { diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c new file mode 100644 index 000000000000..e3b45fe747ae --- /dev/null +++ b/drivers/gpu/drm/i915/gt/selftest_context.c @@ -0,0 +1,310 @@ +/* + * SPDX-License-Identifier: GPL-2.0 + * + * Copyright © 2019 Intel Corporation + */ + +#include "i915_selftest.h" +#include "intel_gt.h" + +#include "gem/selftests/mock_context.h" +#include "selftests/igt_flush_test.h" +#include "selftests/mock_drm.h" + +static int request_sync(struct i915_request *rq) +{ + long timeout; + int err = 0; + + i915_request_get(rq); + + i915_request_add(rq); + timeout = i915_request_wait(rq, 0, HZ / 10); + if (timeout < 0) + err = timeout; + else + i915_request_retire_upto(rq); + + i915_request_put(rq); + + return err; +} + +static int context_sync(struct intel_context *ce) +{ + struct intel_timeline *tl = ce->ring->timeline; + int err = 0; + + do { + struct i915_request *rq; + long timeout; + + rcu_read_lock(); + rq = rcu_dereference(tl->last_request.request); + if (rq) + rq = i915_request_get_rcu(rq); + rcu_read_unlock(); + if (!rq) + break; + + timeout = i915_request_wait(rq, 0, HZ / 10); + if (timeout < 0) + err = timeout; + else + i915_request_retire_upto(rq); + + i915_request_put(rq); + } while (!err); + + return err; +} + +static int __live_active_context(struct intel_engine_cs *engine, + struct i915_gem_context *fixme) +{ + struct intel_context *ce; + int pass; + int err; + + /* + * We keep active contexts alive until after a subsequent context + * switch as the final write from the context-save will be after + * we retire the final request. We track when we unpin the context, + * under the presumption that the final pin is from the last request, + * and instead of immediately unpinning the context, we add a task + * to unpin the context from the next idle-barrier. + * + * This test makes sure that the context is kept alive until a + * subsequent idle-barrier (emitted when the engine wakeref hits 0 + * with no more outstanding requests). + */ + + if (intel_engine_pm_is_awake(engine)) { + pr_err("%s is awake before starting %s!\n", + engine->name, __func__); + return -EINVAL; + } + + ce = intel_context_create(fixme, engine); + if (!ce) + return -ENOMEM; + + for (pass = 0; pass <= 2; pass++) { + struct i915_request *rq; + + rq = intel_context_create_request(ce); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto err; + } + + err = request_sync(rq); + if (err) + goto err; + + /* Context will be kept active until after an idle-barrier. */ + if (i915_active_is_idle(&ce->active)) { + pr_err("context is not active; expected idle-barrier (%s pass %d)\n", + engine->name, pass); + err = -EINVAL; + goto err; + } + + if (!intel_engine_pm_is_awake(engine)) { + pr_err("%s is asleep before idle-barrier\n", + engine->name); + err = -EINVAL; + goto err; + } + } + + /* Now make sure our idle-barriers are flushed */ + err = context_sync(engine->kernel_context); + if (err) + goto err; + + if (!i915_active_is_idle(&ce->active)) { + pr_err("context is still active!"); + err = -EINVAL; + } + + if (intel_engine_pm_is_awake(engine)) { + struct drm_printer p = drm_debug_printer(__func__); + + intel_engine_dump(engine, &p, + "%s is still awake after idle-barriers\n", + engine->name); + GEM_TRACE_DUMP(); + + err = -EINVAL; + goto err; + } + +err: + intel_context_put(ce); + return err; +} + +static int live_active_context(void *arg) +{ + struct intel_gt *gt = arg; + struct intel_engine_cs *engine; + struct i915_gem_context *fixme; + enum intel_engine_id id; + struct drm_file *file; + int err = 0; + + file = mock_file(gt->i915); + if (IS_ERR(file)) + return PTR_ERR(file); + + mutex_lock(>->i915->drm.struct_mutex); + + fixme = live_context(gt->i915, file); + if (!fixme) { + err = -ENOMEM; + goto unlock; + } + + for_each_engine(engine, gt->i915, id) { + err = __live_active_context(engine, fixme); + if (err) + break; + + err = igt_flush_test(gt->i915, I915_WAIT_LOCKED); + if (err) + break; + } + +unlock: + mutex_unlock(>->i915->drm.struct_mutex); + mock_file_free(gt->i915, file); + return err; +} + +static int __remote_sync(struct intel_context *ce, struct intel_context *remote) +{ + struct i915_request *rq; + int err; + + err = intel_context_pin(remote); + if (err) + return err; + + rq = intel_context_create_request(ce); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto unpin; + } + + err = intel_context_prepare_remote_request(remote, rq); + if (err) { + i915_request_add(rq); + goto unpin; + } + + err = request_sync(rq); + +unpin: + intel_context_unpin(remote); + return err; +} + +static int __live_remote_context(struct intel_engine_cs *engine, + struct i915_gem_context *fixme) +{ + struct intel_context *local, *remote; + int pass; + int err; + + /* + * Check that our idle barriers do not interfere with normal + * activity tracking. In particular, check that operating + * on the context image remotely (intel_context_prepare_remote_request) + * which inserts foriegn fences into intel_context.active are not + * clobbered. + */ + + remote = intel_context_create(fixme, engine); + if (!remote) + return -ENOMEM; + + local = intel_context_create(fixme, engine); + if (!local) { + err = -ENOMEM; + goto err_remote; + } + + for (pass = 0; pass <= 2; pass++) { + err = __remote_sync(local, remote); + if (err) + break; + + err = __remote_sync(engine->kernel_context, remote); + if (err) + break; + + if (i915_active_is_idle(&remote->active)) { + pr_err("remote context is not active; expected idle-barrier (%s pass %d)\n", + engine->name, pass); + err = -EINVAL; + break; + } + } + + intel_context_put(local); +err_remote: + intel_context_put(remote); + return err; +} + +static int live_remote_context(void *arg) +{ + struct intel_gt *gt = arg; + struct intel_engine_cs *engine; + struct i915_gem_context *fixme; + enum intel_engine_id id; + struct drm_file *file; + int err = 0; + + file = mock_file(gt->i915); + if (IS_ERR(file)) + return PTR_ERR(file); + + mutex_lock(>->i915->drm.struct_mutex); + + fixme = live_context(gt->i915, file); + if (!fixme) { + err = -ENOMEM; + goto unlock; + } + + for_each_engine(engine, gt->i915, id) { + err = __live_remote_context(engine, fixme); + if (err) + break; + + err = igt_flush_test(gt->i915, I915_WAIT_LOCKED); + if (err) + break; + } + +unlock: + mutex_unlock(>->i915->drm.struct_mutex); + mock_file_free(gt->i915, file); + return err; +} + +int intel_context_live_selftests(struct drm_i915_private *i915) +{ + static const struct i915_subtest tests[] = { + SUBTEST(live_active_context), + SUBTEST(live_remote_context), + }; + struct intel_gt *gt = &i915->gt; + + if (intel_gt_is_wedged(gt)) + return 0; + + return intel_gt_live_subtests(tests, gt); +} diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 22341c62c204..bbb44b072d1a 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -184,6 +184,7 @@ active_instance(struct i915_active *ref, u64 idx) ref->cache = node; mutex_unlock(&ref->mutex); + BUILD_BUG_ON(offsetof(typeof(*node), base)); return &node->base; } @@ -213,6 +214,8 @@ int i915_active_ref(struct i915_active *ref, struct i915_active_request *active; int err; + GEM_BUG_ON(!timeline); /* reserved for idle-barrier */ + /* Prevent reaping in case we malloc/wait while building the tree */ err = i915_active_acquire(ref); if (err) @@ -223,6 +226,7 @@ int i915_active_ref(struct i915_active *ref, err = -ENOMEM; goto out; } + GEM_BUG_ON(IS_ERR(active->request)); if (!i915_active_request_isset(active)) atomic_inc(&ref->count); @@ -374,6 +378,34 @@ void i915_active_fini(struct i915_active *ref) } #endif +static struct active_node *idle_barrier(struct i915_active *ref) +{ + struct active_node *idle = NULL; + struct rb_node *rb; + + if (RB_EMPTY_ROOT(&ref->tree)) + return NULL; + + mutex_lock(&ref->mutex); + for (rb = rb_first(&ref->tree); rb; rb = rb_next(rb)) { + struct active_node *node; + + node = rb_entry(rb, typeof(*node), node); + if (node->timeline) + break; + + if (!i915_active_request_isset(&node->base)) { + GEM_BUG_ON(!list_empty(&node->base.link)); + rb_erase(rb, &ref->tree); + idle = node; + break; + } + } + mutex_unlock(&ref->mutex); + + return idle; +} + int i915_active_acquire_preallocate_barrier(struct i915_active *ref, struct intel_engine_cs *engine) { @@ -383,23 +415,32 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, int err; GEM_BUG_ON(!engine->mask); + GEM_BUG_ON(!llist_empty(&ref->barriers)); + for_each_engine_masked(engine, i915, engine->mask, tmp) { - struct intel_context *kctx = engine->kernel_context; struct active_node *node; - node = kmem_cache_alloc(global.slab_cache, GFP_KERNEL); - if (unlikely(!node)) { - err = -ENOMEM; - goto unwind; + node = idle_barrier(ref); + if (!node) { + node = kmem_cache_alloc(global.slab_cache, + GFP_KERNEL | + __GFP_RETRY_MAYFAIL | + __GFP_NOWARN); + if (unlikely(!node)) { + err = -ENOMEM; + goto unwind; + } + + node->ref = ref; + node->timeline = 0; + node->base.retire = node_retire; } - i915_active_request_init(&node->base, - (void *)engine, node_retire); - node->timeline = kctx->ring->timeline->fence_context; - node->ref = ref; + intel_engine_pm_get(engine); + + RCU_INIT_POINTER(node->base.request, (void *)engine); atomic_inc(&ref->count); - intel_engine_pm_get(engine); llist_add((struct llist_node *)&node->base.link, &ref->barriers); } @@ -434,6 +475,7 @@ void i915_active_acquire_barrier(struct i915_active *ref) node = container_of((struct list_head *)pos, typeof(*node), base.link); + GEM_BUG_ON(node->timeline); engine = (void *)rcu_access_pointer(node->base.request); RCU_INIT_POINTER(node->base.request, ERR_PTR(-EAGAIN)); @@ -442,12 +484,7 @@ void i915_active_acquire_barrier(struct i915_active *ref) p = &ref->tree.rb_node; while (*p) { parent = *p; - if (rb_entry(parent, - struct active_node, - node)->timeline < node->timeline) - p = &parent->rb_right; - else - p = &parent->rb_left; + p = &parent->rb_left; } rb_link_node(&node->node, parent, p); rb_insert_color(&node->node, &ref->tree); diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h index 2b31a4ee0b4c..a841d3f9bedc 100644 --- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h +++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h @@ -15,6 +15,7 @@ selftest(workarounds, intel_workarounds_live_selftests) selftest(timelines, intel_timeline_live_selftests) selftest(requests, i915_request_live_selftests) selftest(active, i915_active_live_selftests) +selftest(gt_contexts, intel_context_live_selftests) selftest(objects, i915_gem_object_live_selftests) selftest(mman, i915_gem_mman_live_selftests) selftest(dmabuf, i915_gem_dmabuf_live_selftests) @@ -24,7 +25,7 @@ selftest(gtt, i915_gem_gtt_live_selftests) selftest(gem, i915_gem_live_selftests) selftest(evict, i915_gem_evict_live_selftests) selftest(hugepages, i915_gem_huge_page_live_selftests) -selftest(contexts, i915_gem_context_live_selftests) +selftest(gem_contexts, i915_gem_context_live_selftests) selftest(blt, i915_gem_object_blt_live_selftests) selftest(client, i915_gem_client_blt_live_selftests) selftest(reset, intel_reset_live_selftests) From patchwork Fri Jul 26 08:45:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060575 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CF66A13AC for ; Fri, 26 Jul 2019 08:48:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BF9CA205FC for ; Fri, 26 Jul 2019 08:48:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B3BC628A3B; Fri, 26 Jul 2019 08:48:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 402D9205FC for ; Fri, 26 Jul 2019 08:48:15 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B6A176E8B5; Fri, 26 Jul 2019 08:48:14 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5BC206E8BF for ; Fri, 26 Jul 2019 08:47:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621076-1500050 for multiple; Fri, 26 Jul 2019 09:46:14 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:49 +0100 Message-Id: <20190726084613.22129-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 03/27] drm/i915/execlists: Force preemption X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP If the preempted context takes too long to relinquish control, e.g. it is stuck inside a shader with arbitration disabled, evict that context with an engine reset. This ensures that preemptions are reasonably responsive, providing a tighter QoS for the more important context at the cost of flagging unresponsive contexts more frequently (i.e. instead of using an ~10s hangcheck, we now evict at ~100ms). The challenge of lies in picking a timeout that can be reasonably serviced by HW for typical workloads, balancing the existing clients against the needs for responsiveness. Note that coupled with timeslicing, this will lead to rapid GPU "hang" detection with multiple active contexts vying for GPU time. Signed-off-by: Chris Wilson Cc: Mika Kuoppala Cc: Tvrtko Ursulin Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/Kconfig.profile | 12 ++++++ drivers/gpu/drm/i915/gt/intel_lrc.c | 62 ++++++++++++++++++++++++++-- 2 files changed, 71 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile index 48df8889a88a..3184e8491333 100644 --- a/drivers/gpu/drm/i915/Kconfig.profile +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -25,3 +25,15 @@ config DRM_I915_SPIN_REQUEST May be 0 to disable the initial spin. In practice, we estimate the cost of enabling the interrupt (if currently disabled) to be a few microseconds. + +config DRM_I915_PREEMPT_TIMEOUT + int "Preempt timeout (ms)" + default 100 # milliseconds + help + How long to wait (in milliseconds) for a preemption event to occur + when submitting a new context via execlists. If the current context + does not hit an arbitration point and yield to HW before the timer + expires, the HW will be reset to allow the more important context + to execute. + + May be 0 to disable the timeout. diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 884dfc1cb033..b85ee12c2451 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -945,6 +945,21 @@ static void record_preemption(struct intel_engine_execlists *execlists) (void)I915_SELFTEST_ONLY(execlists->preempt_hang.count++); } +static unsigned long preempt_expires(void) +{ + const unsigned long timeout = + msecs_to_jiffies_timeout(CONFIG_DRM_I915_PREEMPT_TIMEOUT); + + /* + * Paranoia to make sure the compiler computes the timeout before + * loading 'jiffies' as jiffies is volatile and may be updated in + * the background by a timer tick. All to reduce the complexity + * of the addition and reduce the risk of losing a jiffie. + */ + barrier(); + return jiffies + timeout; +} + static void execlists_dequeue(struct intel_engine_cs *engine) { struct intel_engine_execlists * const execlists = &engine->execlists; @@ -1283,6 +1298,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine) *port = execlists_schedule_in(last, port - execlists->pending); memset(port + 1, 0, (last_port - port) * sizeof(*port)); execlists_submit_ports(engine); + if (CONFIG_DRM_I915_PREEMPT_TIMEOUT) + mod_timer(&execlists->timer, preempt_expires()); } else { ring_set_paused(engine, 0); } @@ -1467,13 +1484,45 @@ static void process_csb(struct intel_engine_cs *engine) invalidate_csb_entries(&buf[0], &buf[num_entries - 1]); } -static void __execlists_submission_tasklet(struct intel_engine_cs *const engine) +static bool __execlists_submission_tasklet(struct intel_engine_cs *const engine) { lockdep_assert_held(&engine->active.lock); process_csb(engine); - if (!engine->execlists.pending[0]) + if (!engine->execlists.pending[0]) { execlists_dequeue(engine); + return true; + } + + return false; +} + +static void preempt_reset(struct intel_engine_cs *engine) +{ + const unsigned int bit = I915_RESET_ENGINE + engine->id; + unsigned long *lock = &engine->gt->reset.flags; + + if (test_and_set_bit(bit, lock)) + return; + + /* Mark this tasklet as disabled to avoid waiting for it to complete */ + tasklet_disable_nosync(&engine->execlists.tasklet); + + intel_engine_reset(engine, "preemption time out"); + + tasklet_enable(&engine->execlists.tasklet); + clear_and_wake_up_bit(bit, lock); +} + +static bool preempt_timeout(struct intel_engine_cs *const engine) +{ + if (!CONFIG_DRM_I915_PREEMPT_TIMEOUT) + return false; + + if (!intel_engine_has_preemption(engine)) + return false; + + return !timer_pending(&engine->execlists.timer); } /* @@ -1484,10 +1533,17 @@ static void execlists_submission_tasklet(unsigned long data) { struct intel_engine_cs * const engine = (struct intel_engine_cs *)data; unsigned long flags; + bool reset = false; spin_lock_irqsave(&engine->active.lock, flags); - __execlists_submission_tasklet(engine); + + if (!__execlists_submission_tasklet(engine) && preempt_timeout(engine)) + reset = true; + spin_unlock_irqrestore(&engine->active.lock, flags); + + if (reset) + preempt_reset(engine); } static void execlists_submission_timer(struct timer_list *timer) From patchwork Fri Jul 26 08:45:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060539 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EDE3A13B1 for ; Fri, 26 Jul 2019 08:46:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DB2CB205FC for ; Fri, 26 Jul 2019 08:46:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CF4AB28A3B; Fri, 26 Jul 2019 08:46:36 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id D83DA28894 for ; Fri, 26 Jul 2019 08:46:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 96D396E8AD; Fri, 26 Jul 2019 08:46:32 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id DCA236E8A9 for ; Fri, 26 Jul 2019 08:46:30 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621077-1500050 for multiple; Fri, 26 Jul 2019 09:46:15 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:50 +0100 Message-Id: <20190726084613.22129-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 04/27] drm/i915: Replace hangcheck by heartbeats X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Replace sampling the engine state every so often with a periodic heartbeat request to measure the health of an engine. This is coupled with the forced-preemption to allow long running requests to survive so long as they do not block other users. Signed-off-by: Chris Wilson Cc: Joonas Lahtinen Cc: Tvrtko Ursulin Cc: Jon Bloomfield --- Note, strictly this still requires struct_mutex-free retirement for the corner case where the idle-worker is ineffective and we get a backlog of requests on the kernel ring. Or if the legacy ringbuffer is full... When we are able to retire from outside of struct_mutex we can do the full idle-barrier and idle-work from here. --- drivers/gpu/drm/i915/Kconfig.profile | 11 + drivers/gpu/drm/i915/Makefile | 2 +- drivers/gpu/drm/i915/gem/i915_gem_pm.c | 2 - drivers/gpu/drm/i915/gt/intel_engine.h | 32 -- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 10 +- .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 99 +++++ .../gpu/drm/i915/gt/intel_engine_heartbeat.h | 17 + drivers/gpu/drm/i915/gt/intel_engine_pm.c | 5 +- drivers/gpu/drm/i915/gt/intel_engine_types.h | 14 +- drivers/gpu/drm/i915/gt/intel_gt.c | 1 - drivers/gpu/drm/i915/gt/intel_gt.h | 4 - drivers/gpu/drm/i915/gt/intel_gt_pm.c | 2 - drivers/gpu/drm/i915/gt/intel_gt_types.h | 9 - drivers/gpu/drm/i915/gt/intel_hangcheck.c | 360 ------------------ drivers/gpu/drm/i915/gt/intel_reset.c | 3 +- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 4 - drivers/gpu/drm/i915/i915_debugfs.c | 86 ----- drivers/gpu/drm/i915/i915_drv.c | 6 +- drivers/gpu/drm/i915/i915_drv.h | 1 - drivers/gpu/drm/i915/i915_gpu_error.c | 37 +- drivers/gpu/drm/i915/i915_gpu_error.h | 2 - drivers/gpu/drm/i915/i915_params.c | 5 - drivers/gpu/drm/i915/i915_params.h | 1 - 23 files changed, 151 insertions(+), 562 deletions(-) create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h delete mode 100644 drivers/gpu/drm/i915/gt/intel_hangcheck.c diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile index 3184e8491333..aafb57f84169 100644 --- a/drivers/gpu/drm/i915/Kconfig.profile +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -37,3 +37,14 @@ config DRM_I915_PREEMPT_TIMEOUT to execute. May be 0 to disable the timeout. + +config DRM_I915_HEARTBEAT_INTERVAL + int "Interval between heartbeat pulses (ms)" + default 2500 # microseconds + help + While active the driver uses a periodic request, a heartbeat, to + check the wellness of the GPU and to regularly flush state changes + (idle barriers). + + May be 0 to disable heartbeats and therefore disable automatic GPU + hang detection. diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 524516251a40..18201852d68d 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -73,10 +73,10 @@ gt-y += \ gt/intel_breadcrumbs.o \ gt/intel_context.o \ gt/intel_engine_cs.o \ + gt/intel_engine_heartbeat.o \ gt/intel_engine_pm.o \ gt/intel_gt.o \ gt/intel_gt_pm.o \ - gt/intel_hangcheck.o \ gt/intel_lrc.o \ gt/intel_renderstate.o \ gt/intel_reset.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c index b5561cbdc5ea..a5a0aefcc04c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c @@ -170,8 +170,6 @@ void i915_gem_suspend(struct drm_i915_private *i915) GEM_BUG_ON(i915->gt.awake); flush_work(&i915->gem.idle_work); - cancel_delayed_work_sync(&i915->gt.hangcheck.work); - i915_gem_drain_freed_objects(i915); intel_uc_suspend(&i915->gt.uc); diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index db5c73ce86ee..38eeb4320c97 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -90,38 +90,6 @@ struct drm_printer; /* seqno size is actually only a uint32, but since we plan to use MI_FLUSH_DW to * do the writes, and that must have qw aligned offsets, simply pretend it's 8b. */ -enum intel_engine_hangcheck_action { - ENGINE_IDLE = 0, - ENGINE_WAIT, - ENGINE_ACTIVE_SEQNO, - ENGINE_ACTIVE_HEAD, - ENGINE_ACTIVE_SUBUNITS, - ENGINE_WAIT_KICK, - ENGINE_DEAD, -}; - -static inline const char * -hangcheck_action_to_str(const enum intel_engine_hangcheck_action a) -{ - switch (a) { - case ENGINE_IDLE: - return "idle"; - case ENGINE_WAIT: - return "wait"; - case ENGINE_ACTIVE_SEQNO: - return "active seqno"; - case ENGINE_ACTIVE_HEAD: - return "active head"; - case ENGINE_ACTIVE_SUBUNITS: - return "active subunits"; - case ENGINE_WAIT_KICK: - return "wait kick"; - case ENGINE_DEAD: - return "dead"; - } - - return "unknown"; -} void intel_engines_set_scheduler_caps(struct drm_i915_private *i915); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 65cbf1d9118d..89f4a3825b77 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -621,7 +621,6 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine) intel_engine_init_active(engine, ENGINE_PHYSICAL); intel_engine_init_breadcrumbs(engine); intel_engine_init_execlists(engine); - intel_engine_init_hangcheck(engine); intel_engine_init_batch_pool(engine); intel_engine_init_cmd_parser(engine); intel_engine_init__pm(engine); @@ -1453,8 +1452,13 @@ void intel_engine_dump(struct intel_engine_cs *engine, drm_printf(m, "*** WEDGED ***\n"); drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count)); - drm_printf(m, "\tHangcheck: %d ms ago\n", - jiffies_to_msecs(jiffies - engine->hangcheck.action_timestamp)); + + rcu_read_lock(); + rq = READ_ONCE(engine->last_heartbeat); + if (rq) + drm_printf(m, "\tHeartbeat: %d ms ago\n", + jiffies_to_msecs(jiffies - rq->emitted_jiffies)); + rcu_read_unlock(); drm_printf(m, "\tReset count: %d (global %d)\n", i915_reset_engine_count(error, engine), i915_reset_count(error)); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c new file mode 100644 index 000000000000..42330b1074e6 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c @@ -0,0 +1,99 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#include "i915_request.h" + +#include "intel_context.h" +#include "intel_engine_heartbeat.h" +#include "intel_engine_pm.h" +#include "intel_gt.h" +#include "intel_reset.h" + +static long delay(void) +{ + const long t = msecs_to_jiffies(CONFIG_DRM_I915_HEARTBEAT_INTERVAL); + + return round_jiffies_up_relative(t); +} + +static void heartbeat(struct work_struct *wrk) +{ + struct intel_engine_cs *engine = + container_of(wrk, typeof(*engine), heartbeat.work); + struct intel_context *ce = engine->kernel_context; + struct i915_request *rq; + + if (!intel_engine_pm_get_if_awake(engine)) + return; + + rq = engine->last_heartbeat; + if (rq && i915_request_completed(rq)) { + i915_request_put(rq); + engine->last_heartbeat = NULL; + } + + if (intel_gt_is_wedged(engine->gt)) + goto out; + + if (engine->last_heartbeat) { + if (engine->schedule && rq->sched.attr.priority < INT_MAX) { + struct i915_sched_attr attr = { .priority = INT_MAX }; + + local_bh_disable(); + engine->schedule(rq, &attr); + local_bh_enable(); + } else { + intel_gt_handle_error(engine->gt, engine->mask, + I915_ERROR_CAPTURE, + "stopped heartbeat on %s", + engine->name); + } + goto out; + } + + if (engine->wakeref_serial == engine->serial) + goto out; + + if (intel_context_timeline_lock(ce)) + goto out; + + intel_context_enter(ce); + rq = __i915_request_create(ce, GFP_NOWAIT | __GFP_NOWARN); + intel_context_exit(ce); + if (IS_ERR(rq)) + goto unlock; + + engine->last_heartbeat = i915_request_get(rq); + engine->wakeref_serial = engine->serial + 1; + + i915_request_add_barriers(rq); + __i915_request_commit(rq); + +unlock: + intel_context_timeline_unlock(ce); +out: + schedule_delayed_work(&engine->heartbeat, delay()); + intel_engine_pm_put(engine); +} + +void intel_engine_unpark_heartbeat(struct intel_engine_cs *engine) +{ + if (!CONFIG_DRM_I915_HEARTBEAT_INTERVAL) + return; + + schedule_delayed_work(&engine->heartbeat, delay()); +} + +void intel_engine_park_heartbeat(struct intel_engine_cs *engine) +{ + cancel_delayed_work_sync(&engine->heartbeat); + i915_request_put(fetch_and_zero(&engine->last_heartbeat)); +} + +void intel_engine_init_heartbeat(struct intel_engine_cs *engine) +{ + INIT_DELAYED_WORK(&engine->heartbeat, heartbeat); +} diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h new file mode 100644 index 000000000000..e04f9c7aa85d --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h @@ -0,0 +1,17 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#ifndef INTEL_ENGINE_HEARTBEAT_H +#define INTEL_ENGINE_HEARTBEAT_H + +struct intel_engine_cs; + +void intel_engine_init_heartbeat(struct intel_engine_cs *engine); + +void intel_engine_park_heartbeat(struct intel_engine_cs *engine); +void intel_engine_unpark_heartbeat(struct intel_engine_cs *engine); + +#endif /* INTEL_ENGINE_HEARTBEAT_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index e74fbf04a68d..6349bb038c72 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -7,6 +7,7 @@ #include "i915_drv.h" #include "intel_engine.h" +#include "intel_engine_heartbeat.h" #include "intel_engine_pm.h" #include "intel_gt.h" #include "intel_gt_pm.h" @@ -32,7 +33,7 @@ static int __engine_unpark(struct intel_wakeref *wf) if (engine->unpark) engine->unpark(engine); - intel_engine_init_hangcheck(engine); + intel_engine_unpark_heartbeat(engine); return 0; } @@ -115,6 +116,7 @@ static int __engine_park(struct intel_wakeref *wf) GEM_TRACE("%s\n", engine->name); + intel_engine_park_heartbeat(engine); intel_engine_disarm_breadcrumbs(engine); /* Must be reset upon idling, or we may miss the busy wakeup. */ @@ -142,4 +144,5 @@ void intel_engine_pm_put(struct intel_engine_cs *engine) void intel_engine_init__pm(struct intel_engine_cs *engine) { intel_wakeref_init(&engine->wakeref); + intel_engine_init_heartbeat(engine); } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 8be63019d707..9a3ead556743 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -14,6 +14,7 @@ #include #include #include +#include #include "i915_gem.h" #include "i915_gem_batch_pool.h" @@ -55,14 +56,6 @@ struct intel_instdone { u32 row[I915_MAX_SLICES][I915_MAX_SUBSLICES]; }; -struct intel_engine_hangcheck { - u64 acthd; - u32 last_ring; - u32 last_head; - unsigned long action_timestamp; - struct intel_instdone instdone; -}; - struct intel_ring { struct kref ref; struct i915_vma *vma; @@ -298,6 +291,9 @@ struct intel_engine_cs { struct drm_i915_gem_object *default_state; void *pinned_default_state; + struct delayed_work heartbeat; + struct i915_request *last_heartbeat; + /* Rather than have every client wait upon all user interrupts, * with the herd waking after every interrupt and each doing the * heavyweight seqno dance, we delegate the task (of being the @@ -437,8 +433,6 @@ struct intel_engine_cs { /* status_notifier: list of callbacks for context-switch changes */ struct atomic_notifier_head context_status_notifier; - struct intel_engine_hangcheck hangcheck; - #define I915_ENGINE_NEEDS_CMD_PARSER BIT(0) #define I915_ENGINE_SUPPORTS_STATS BIT(1) #define I915_ENGINE_HAS_PREEMPTION BIT(2) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index f7e69db4019d..feaf916c9abd 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -19,7 +19,6 @@ void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915) spin_lock_init(>->closed_lock); - intel_gt_init_hangcheck(gt); intel_gt_init_reset(gt); intel_gt_pm_init_early(gt); } diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h b/drivers/gpu/drm/i915/gt/intel_gt.h index 640bb0531f5b..54d22913ede3 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.h +++ b/drivers/gpu/drm/i915/gt/intel_gt.h @@ -39,8 +39,6 @@ void intel_gt_clear_error_registers(struct intel_gt *gt, void intel_gt_flush_ggtt_writes(struct intel_gt *gt); void intel_gt_chipset_flush(struct intel_gt *gt); -void intel_gt_init_hangcheck(struct intel_gt *gt); - int intel_gt_init_scratch(struct intel_gt *gt, unsigned int size); void intel_gt_fini_scratch(struct intel_gt *gt); @@ -55,6 +53,4 @@ static inline bool intel_gt_is_wedged(struct intel_gt *gt) return __intel_reset_failed(>->reset); } -void intel_gt_queue_hangcheck(struct intel_gt *gt); - #endif /* __INTEL_GT_H__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c index 65c0d0c9d543..d223b07ee324 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c @@ -46,8 +46,6 @@ static int intel_gt_unpark(struct intel_wakeref *wf) i915_pmu_gt_unparked(i915); - intel_gt_queue_hangcheck(gt); - pm_notify(i915, INTEL_GT_UNPARK); return 0; diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 34d4a868e4f1..0112bb98cbf3 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -23,14 +23,6 @@ struct drm_i915_private; struct i915_ggtt; struct intel_uncore; -struct intel_hangcheck { - /* For hangcheck timer */ -#define DRM_I915_HANGCHECK_PERIOD 1500 /* in ms */ -#define DRM_I915_HANGCHECK_JIFFIES msecs_to_jiffies(DRM_I915_HANGCHECK_PERIOD) - - struct delayed_work work; -}; - struct intel_gt { struct drm_i915_private *i915; struct intel_uncore *uncore; @@ -54,7 +46,6 @@ struct intel_gt { struct list_head closed_vma; spinlock_t closed_lock; /* guards the list of closed_vma */ - struct intel_hangcheck hangcheck; struct intel_reset reset; /** diff --git a/drivers/gpu/drm/i915/gt/intel_hangcheck.c b/drivers/gpu/drm/i915/gt/intel_hangcheck.c deleted file mode 100644 index 05d042cdefe2..000000000000 --- a/drivers/gpu/drm/i915/gt/intel_hangcheck.c +++ /dev/null @@ -1,360 +0,0 @@ -/* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - * - */ - -#include "i915_drv.h" -#include "intel_engine.h" -#include "intel_gt.h" -#include "intel_reset.h" - -struct hangcheck { - u64 acthd; - u32 ring; - u32 head; - enum intel_engine_hangcheck_action action; - unsigned long action_timestamp; - int deadlock; - struct intel_instdone instdone; - bool wedged:1; - bool stalled:1; -}; - -static bool instdone_unchanged(u32 current_instdone, u32 *old_instdone) -{ - u32 tmp = current_instdone | *old_instdone; - bool unchanged; - - unchanged = tmp == *old_instdone; - *old_instdone |= tmp; - - return unchanged; -} - -static bool subunits_stuck(struct intel_engine_cs *engine) -{ - struct drm_i915_private *dev_priv = engine->i915; - struct intel_instdone instdone; - struct intel_instdone *accu_instdone = &engine->hangcheck.instdone; - bool stuck; - int slice; - int subslice; - - intel_engine_get_instdone(engine, &instdone); - - /* There might be unstable subunit states even when - * actual head is not moving. Filter out the unstable ones by - * accumulating the undone -> done transitions and only - * consider those as progress. - */ - stuck = instdone_unchanged(instdone.instdone, - &accu_instdone->instdone); - stuck &= instdone_unchanged(instdone.slice_common, - &accu_instdone->slice_common); - - for_each_instdone_slice_subslice(dev_priv, slice, subslice) { - stuck &= instdone_unchanged(instdone.sampler[slice][subslice], - &accu_instdone->sampler[slice][subslice]); - stuck &= instdone_unchanged(instdone.row[slice][subslice], - &accu_instdone->row[slice][subslice]); - } - - return stuck; -} - -static enum intel_engine_hangcheck_action -head_stuck(struct intel_engine_cs *engine, u64 acthd) -{ - if (acthd != engine->hangcheck.acthd) { - - /* Clear subunit states on head movement */ - memset(&engine->hangcheck.instdone, 0, - sizeof(engine->hangcheck.instdone)); - - return ENGINE_ACTIVE_HEAD; - } - - if (!subunits_stuck(engine)) - return ENGINE_ACTIVE_SUBUNITS; - - return ENGINE_DEAD; -} - -static enum intel_engine_hangcheck_action -engine_stuck(struct intel_engine_cs *engine, u64 acthd) -{ - enum intel_engine_hangcheck_action ha; - u32 tmp; - - ha = head_stuck(engine, acthd); - if (ha != ENGINE_DEAD) - return ha; - - if (IS_GEN(engine->i915, 2)) - return ENGINE_DEAD; - - /* Is the chip hanging on a WAIT_FOR_EVENT? - * If so we can simply poke the RB_WAIT bit - * and break the hang. This should work on - * all but the second generation chipsets. - */ - tmp = ENGINE_READ(engine, RING_CTL); - if (tmp & RING_WAIT) { - intel_gt_handle_error(engine->gt, engine->mask, 0, - "stuck wait on %s", engine->name); - ENGINE_WRITE(engine, RING_CTL, tmp); - return ENGINE_WAIT_KICK; - } - - return ENGINE_DEAD; -} - -static void hangcheck_load_sample(struct intel_engine_cs *engine, - struct hangcheck *hc) -{ - hc->acthd = intel_engine_get_active_head(engine); - hc->ring = ENGINE_READ(engine, RING_START); - hc->head = ENGINE_READ(engine, RING_HEAD); -} - -static void hangcheck_store_sample(struct intel_engine_cs *engine, - const struct hangcheck *hc) -{ - engine->hangcheck.acthd = hc->acthd; - engine->hangcheck.last_ring = hc->ring; - engine->hangcheck.last_head = hc->head; -} - -static enum intel_engine_hangcheck_action -hangcheck_get_action(struct intel_engine_cs *engine, - const struct hangcheck *hc) -{ - if (intel_engine_is_idle(engine)) - return ENGINE_IDLE; - - if (engine->hangcheck.last_ring != hc->ring) - return ENGINE_ACTIVE_SEQNO; - - if (engine->hangcheck.last_head != hc->head) - return ENGINE_ACTIVE_SEQNO; - - return engine_stuck(engine, hc->acthd); -} - -static void hangcheck_accumulate_sample(struct intel_engine_cs *engine, - struct hangcheck *hc) -{ - unsigned long timeout = I915_ENGINE_DEAD_TIMEOUT; - - hc->action = hangcheck_get_action(engine, hc); - - /* We always increment the progress - * if the engine is busy and still processing - * the same request, so that no single request - * can run indefinitely (such as a chain of - * batches). The only time we do not increment - * the hangcheck score on this ring, if this - * engine is in a legitimate wait for another - * engine. In that case the waiting engine is a - * victim and we want to be sure we catch the - * right culprit. Then every time we do kick - * the ring, make it as a progress as the seqno - * advancement might ensure and if not, it - * will catch the hanging engine. - */ - - switch (hc->action) { - case ENGINE_IDLE: - case ENGINE_ACTIVE_SEQNO: - /* Clear head and subunit states on seqno movement */ - hc->acthd = 0; - - memset(&engine->hangcheck.instdone, 0, - sizeof(engine->hangcheck.instdone)); - - /* Intentional fall through */ - case ENGINE_WAIT_KICK: - case ENGINE_WAIT: - engine->hangcheck.action_timestamp = jiffies; - break; - - case ENGINE_ACTIVE_HEAD: - case ENGINE_ACTIVE_SUBUNITS: - /* - * Seqno stuck with still active engine gets leeway, - * in hopes that it is just a long shader. - */ - timeout = I915_SEQNO_DEAD_TIMEOUT; - break; - - case ENGINE_DEAD: - break; - - default: - MISSING_CASE(hc->action); - } - - hc->stalled = time_after(jiffies, - engine->hangcheck.action_timestamp + timeout); - hc->wedged = time_after(jiffies, - engine->hangcheck.action_timestamp + - I915_ENGINE_WEDGED_TIMEOUT); -} - -static void hangcheck_declare_hang(struct intel_gt *gt, - intel_engine_mask_t hung, - intel_engine_mask_t stuck) -{ - struct intel_engine_cs *engine; - intel_engine_mask_t tmp; - char msg[80]; - int len; - - /* If some rings hung but others were still busy, only - * blame the hanging rings in the synopsis. - */ - if (stuck != hung) - hung &= ~stuck; - len = scnprintf(msg, sizeof(msg), - "%s on ", stuck == hung ? "no progress" : "hang"); - for_each_engine_masked(engine, gt->i915, hung, tmp) - len += scnprintf(msg + len, sizeof(msg) - len, - "%s, ", engine->name); - msg[len-2] = '\0'; - - return intel_gt_handle_error(gt, hung, I915_ERROR_CAPTURE, "%s", msg); -} - -/* - * This is called when the chip hasn't reported back with completed - * batchbuffers in a long time. We keep track per ring seqno progress and - * if there are no progress, hangcheck score for that ring is increased. - * Further, acthd is inspected to see if the ring is stuck. On stuck case - * we kick the ring. If we see no progress on three subsequent calls - * we assume chip is wedged and try to fix it by resetting the chip. - */ -static void hangcheck_elapsed(struct work_struct *work) -{ - struct intel_gt *gt = - container_of(work, typeof(*gt), hangcheck.work.work); - intel_engine_mask_t hung = 0, stuck = 0, wedged = 0; - struct intel_engine_cs *engine; - enum intel_engine_id id; - intel_wakeref_t wakeref; - - if (!i915_modparams.enable_hangcheck) - return; - - if (!READ_ONCE(gt->awake)) - return; - - if (intel_gt_is_wedged(gt)) - return; - - wakeref = intel_runtime_pm_get_if_in_use(>->i915->runtime_pm); - if (!wakeref) - return; - - /* As enabling the GPU requires fairly extensive mmio access, - * periodically arm the mmio checker to see if we are triggering - * any invalid access. - */ - intel_uncore_arm_unclaimed_mmio_detection(gt->uncore); - - for_each_engine(engine, gt->i915, id) { - struct hangcheck hc; - - intel_engine_signal_breadcrumbs(engine); - - hangcheck_load_sample(engine, &hc); - hangcheck_accumulate_sample(engine, &hc); - hangcheck_store_sample(engine, &hc); - - if (hc.stalled) { - hung |= engine->mask; - if (hc.action != ENGINE_DEAD) - stuck |= engine->mask; - } - - if (hc.wedged) - wedged |= engine->mask; - } - - if (GEM_SHOW_DEBUG() && (hung | stuck)) { - struct drm_printer p = drm_debug_printer("hangcheck"); - - for_each_engine(engine, gt->i915, id) { - if (intel_engine_is_idle(engine)) - continue; - - intel_engine_dump(engine, &p, "%s\n", engine->name); - } - } - - if (wedged) { - dev_err(gt->i915->drm.dev, - "GPU recovery timed out," - " cancelling all in-flight rendering.\n"); - GEM_TRACE_DUMP(); - intel_gt_set_wedged(gt); - } - - if (hung) - hangcheck_declare_hang(gt, hung, stuck); - - intel_runtime_pm_put(>->i915->runtime_pm, wakeref); - - /* Reset timer in case GPU hangs without another request being added */ - intel_gt_queue_hangcheck(gt); -} - -void intel_gt_queue_hangcheck(struct intel_gt *gt) -{ - unsigned long delay; - - if (unlikely(!i915_modparams.enable_hangcheck)) - return; - - /* - * Don't continually defer the hangcheck so that it is always run at - * least once after work has been scheduled on any ring. Otherwise, - * we will ignore a hung ring if a second ring is kept busy. - */ - - delay = round_jiffies_up_relative(DRM_I915_HANGCHECK_JIFFIES); - queue_delayed_work(system_long_wq, >->hangcheck.work, delay); -} - -void intel_engine_init_hangcheck(struct intel_engine_cs *engine) -{ - memset(&engine->hangcheck, 0, sizeof(engine->hangcheck)); - engine->hangcheck.action_timestamp = jiffies; -} - -void intel_gt_init_hangcheck(struct intel_gt *gt) -{ - INIT_DELAYED_WORK(>->hangcheck.work, hangcheck_elapsed); -} - -#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) -#include "selftest_hangcheck.c" -#endif diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index 98c071fe532b..0cb457538e62 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -978,8 +978,6 @@ void intel_gt_reset(struct intel_gt *gt, if (ret) goto taint; - intel_gt_queue_hangcheck(gt); - finish: reset_finish(gt, awake); unlock: @@ -1305,4 +1303,5 @@ void __intel_fini_wedge(struct intel_wedge_me *w) #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) #include "selftest_reset.c" +#include "selftest_hangcheck.c" #endif diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c index e2fa38a1ff0f..9d64390966e3 100644 --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c @@ -1732,7 +1732,6 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915) }; struct intel_gt *gt = &i915->gt; intel_wakeref_t wakeref; - bool saved_hangcheck; int err; if (!intel_has_gpu_reset(gt->i915)) @@ -1742,8 +1741,6 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915) return -EIO; /* we're long past hope of a successful reset */ wakeref = intel_runtime_pm_get(>->i915->runtime_pm); - saved_hangcheck = fetch_and_zero(&i915_modparams.enable_hangcheck); - drain_delayed_work(>->hangcheck.work); /* flush param */ err = intel_gt_live_subtests(tests, gt); @@ -1751,7 +1748,6 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915) igt_flush_test(gt->i915, I915_WAIT_LOCKED); mutex_unlock(>->i915->drm.struct_mutex); - i915_modparams.enable_hangcheck = saved_hangcheck; intel_runtime_pm_put(>->i915->runtime_pm, wakeref); return err; diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 24787bb48c9f..1f0408ff345b 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1044,91 +1044,6 @@ static int i915_frequency_info(struct seq_file *m, void *unused) return ret; } -static void i915_instdone_info(struct drm_i915_private *dev_priv, - struct seq_file *m, - struct intel_instdone *instdone) -{ - int slice; - int subslice; - - seq_printf(m, "\t\tINSTDONE: 0x%08x\n", - instdone->instdone); - - if (INTEL_GEN(dev_priv) <= 3) - return; - - seq_printf(m, "\t\tSC_INSTDONE: 0x%08x\n", - instdone->slice_common); - - if (INTEL_GEN(dev_priv) <= 6) - return; - - for_each_instdone_slice_subslice(dev_priv, slice, subslice) - seq_printf(m, "\t\tSAMPLER_INSTDONE[%d][%d]: 0x%08x\n", - slice, subslice, instdone->sampler[slice][subslice]); - - for_each_instdone_slice_subslice(dev_priv, slice, subslice) - seq_printf(m, "\t\tROW_INSTDONE[%d][%d]: 0x%08x\n", - slice, subslice, instdone->row[slice][subslice]); -} - -static int i915_hangcheck_info(struct seq_file *m, void *unused) -{ - struct drm_i915_private *i915 = node_to_i915(m->private); - struct intel_gt *gt = &i915->gt; - struct intel_engine_cs *engine; - intel_wakeref_t wakeref; - enum intel_engine_id id; - - seq_printf(m, "Reset flags: %lx\n", gt->reset.flags); - if (test_bit(I915_WEDGED, >->reset.flags)) - seq_puts(m, "\tWedged\n"); - if (test_bit(I915_RESET_BACKOFF, >->reset.flags)) - seq_puts(m, "\tDevice (global) reset in progress\n"); - - if (!i915_modparams.enable_hangcheck) { - seq_puts(m, "Hangcheck disabled\n"); - return 0; - } - - if (timer_pending(>->hangcheck.work.timer)) - seq_printf(m, "Hangcheck active, timer fires in %dms\n", - jiffies_to_msecs(gt->hangcheck.work.timer.expires - - jiffies)); - else if (delayed_work_pending(>->hangcheck.work)) - seq_puts(m, "Hangcheck active, work pending\n"); - else - seq_puts(m, "Hangcheck inactive\n"); - - seq_printf(m, "GT active? %s\n", yesno(gt->awake)); - - with_intel_runtime_pm(&i915->runtime_pm, wakeref) { - for_each_engine(engine, i915, id) { - struct intel_instdone instdone; - - seq_printf(m, "%s: %d ms ago\n", - engine->name, - jiffies_to_msecs(jiffies - - engine->hangcheck.action_timestamp)); - - seq_printf(m, "\tACTHD = 0x%08llx [current 0x%08llx]\n", - (long long)engine->hangcheck.acthd, - intel_engine_get_active_head(engine)); - - intel_engine_get_instdone(engine, &instdone); - - seq_puts(m, "\tinstdone read =\n"); - i915_instdone_info(i915, m, &instdone); - - seq_puts(m, "\tinstdone accu =\n"); - i915_instdone_info(i915, m, - &engine->hangcheck.instdone); - } - } - - return 0; -} - static int ironlake_drpc_info(struct seq_file *m) { struct drm_i915_private *i915 = node_to_i915(m->private); @@ -4387,7 +4302,6 @@ static const struct drm_info_list i915_debugfs_list[] = { {"i915_guc_stage_pool", i915_guc_stage_pool, 0}, {"i915_huc_load_status", i915_huc_load_status_info, 0}, {"i915_frequency_info", i915_frequency_info, 0}, - {"i915_hangcheck_info", i915_hangcheck_info, 0}, {"i915_drpc_info", i915_drpc_info, 0}, {"i915_emon_status", i915_emon_status, 0}, {"i915_ring_freq_table", i915_ring_freq_table, 0}, diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index f2d3d754af37..8c277fbf9663 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -411,8 +411,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data, return -ENODEV; break; case I915_PARAM_HAS_GPU_RESET: - value = i915_modparams.enable_hangcheck && - intel_has_gpu_reset(dev_priv); + value = intel_has_gpu_reset(dev_priv); if (value && intel_has_reset_engine(dev_priv)) value = 2; break; @@ -1975,10 +1974,7 @@ void i915_driver_remove(struct drm_device *dev) intel_csr_ucode_fini(dev_priv); - /* Free error state after interrupts are fully disabled. */ - cancel_delayed_work_sync(&dev_priv->gt.hangcheck.work); i915_reset_error_state(dev_priv); - i915_gem_driver_remove(dev_priv); intel_power_domains_driver_remove(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 59d4a1146039..0e715b1266f0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2375,7 +2375,6 @@ extern const struct dev_pm_ops i915_pm_ops; int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent); void i915_driver_remove(struct drm_device *dev); -void intel_engine_init_hangcheck(struct intel_engine_cs *engine); int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on); static inline bool intel_gvt_active(struct drm_i915_private *dev_priv) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 674d341a23f6..eff10ca5a788 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -554,10 +554,6 @@ static void error_print_engine(struct drm_i915_error_state_buf *m, } err_printf(m, " ring->head: 0x%08x\n", ee->cpu_ring_head); err_printf(m, " ring->tail: 0x%08x\n", ee->cpu_ring_tail); - err_printf(m, " hangcheck timestamp: %dms (%lu%s)\n", - jiffies_to_msecs(ee->hangcheck_timestamp - epoch), - ee->hangcheck_timestamp, - ee->hangcheck_timestamp == epoch ? "; epoch" : ""); err_printf(m, " engine reset count: %u\n", ee->reset_count); for (n = 0; n < ee->num_ports; n++) { @@ -695,11 +691,8 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m, ts = ktime_to_timespec64(error->uptime); err_printf(m, "Uptime: %lld s %ld us\n", (s64)ts.tv_sec, ts.tv_nsec / NSEC_PER_USEC); - err_printf(m, "Epoch: %lu jiffies (%u HZ)\n", error->epoch, HZ); - err_printf(m, "Capture: %lu jiffies; %d ms ago, %d ms after epoch\n", - error->capture, - jiffies_to_msecs(jiffies - error->capture), - jiffies_to_msecs(error->capture - error->epoch)); + err_printf(m, "Capture: %lu jiffies; %d ms ago\n", + error->capture, jiffies_to_msecs(jiffies - error->capture)); for (i = 0; i < ARRAY_SIZE(error->engine); i++) { if (!error->engine[i].context.pid) @@ -760,7 +753,9 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m, for (i = 0; i < ARRAY_SIZE(error->engine); i++) { if (error->engine[i].engine_id != -1) - error_print_engine(m, &error->engine[i], error->epoch); + error_print_engine(m, + &error->engine[i], + error->capture); } for (i = 0; i < ARRAY_SIZE(error->engine); i++) { @@ -790,7 +785,7 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m, for (j = 0; j < ee->num_requests; j++) error_print_request(m, " ", &ee->requests[j], - error->epoch); + error->capture); } print_error_obj(m, m->i915->engine[i], @@ -1171,8 +1166,6 @@ static void error_record_engine_registers(struct i915_gpu_state *error, } ee->idle = intel_engine_is_idle(engine); - if (!ee->idle) - ee->hangcheck_timestamp = engine->hangcheck.action_timestamp; ee->reset_count = i915_reset_engine_count(&dev_priv->gpu_error, engine); @@ -1673,22 +1666,6 @@ static void capture_params(struct i915_gpu_state *error) i915_params_copy(&error->params, &i915_modparams); } -static unsigned long capture_find_epoch(const struct i915_gpu_state *error) -{ - unsigned long epoch = error->capture; - int i; - - for (i = 0; i < ARRAY_SIZE(error->engine); i++) { - const struct drm_i915_error_engine *ee = &error->engine[i]; - - if (ee->hangcheck_timestamp && - time_before(ee->hangcheck_timestamp, epoch)) - epoch = ee->hangcheck_timestamp; - } - - return epoch; -} - static void capture_finish(struct i915_gpu_state *error) { struct i915_ggtt *ggtt = &error->i915->ggtt; @@ -1740,8 +1717,6 @@ i915_capture_gpu_state(struct drm_i915_private *i915) error->overlay = intel_overlay_capture_error_state(i915); error->display = intel_display_capture_error_state(i915); - error->epoch = capture_find_epoch(error); - capture_finish(error); compress_fini(&compress); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h index a24c35107d16..c4ba297ca862 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.h +++ b/drivers/gpu/drm/i915/i915_gpu_error.h @@ -34,7 +34,6 @@ struct i915_gpu_state { ktime_t boottime; ktime_t uptime; unsigned long capture; - unsigned long epoch; struct drm_i915_private *i915; @@ -84,7 +83,6 @@ struct i915_gpu_state { int engine_id; /* Software tracked state */ bool idle; - unsigned long hangcheck_timestamp; int num_requests; u32 reset_count; diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index 296452f9efe4..e4cd58e90e40 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -77,11 +77,6 @@ i915_param_named(error_capture, bool, 0600, "triaging and debugging hangs."); #endif -i915_param_named_unsafe(enable_hangcheck, bool, 0600, - "Periodically check GPU activity for detecting hangs. " - "WARNING: Disabling this can cause system wide hangs. " - "(default: true)"); - i915_param_named_unsafe(enable_psr, int, 0600, "Enable PSR " "(0=disabled, 1=enabled) " diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h index d29ade3b7de6..9a3359c6f6f3 100644 --- a/drivers/gpu/drm/i915/i915_params.h +++ b/drivers/gpu/drm/i915/i915_params.h @@ -68,7 +68,6 @@ struct drm_printer; param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE) \ /* leave bools at the end to not create holes */ \ param(bool, alpha_support, IS_ENABLED(CONFIG_DRM_I915_ALPHA_SUPPORT)) \ - param(bool, enable_hangcheck, true) \ param(bool, prefault_disable, false) \ param(bool, load_detect_test, false) \ param(bool, force_reset_modeset_test, false) \ From patchwork Fri Jul 26 08:45:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060531 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 637211398 for ; Fri, 26 Jul 2019 08:46:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 526B5205FC for ; Fri, 26 Jul 2019 08:46:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 43A0F28A3B; Fri, 26 Jul 2019 08:46:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id ADBFC205FC for ; Fri, 26 Jul 2019 08:46:27 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D6BEB6E8AA; Fri, 26 Jul 2019 08:46:26 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id CB8916E8A7 for ; Fri, 26 Jul 2019 08:46:25 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621078-1500050 for multiple; Fri, 26 Jul 2019 09:46:15 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:51 +0100 Message-Id: <20190726084613.22129-5-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 05/27] drm/i915/gem: Make caps.scheduler static X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We do not notify userspace when the scheduler capabilities are changed (due to wedging the driver) and as such userspace will expect the caps to be static and unchanging. Make it so, and so we only need to compute our caps once during driver registration. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 6 +++--- drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 4 ++-- drivers/gpu/drm/i915/gt/intel_reset.c | 5 +---- drivers/gpu/drm/i915/i915_drv.c | 4 ++-- drivers/gpu/drm/i915/i915_drv.h | 6 ++++-- drivers/gpu/drm/i915/i915_gem.c | 13 +++++++++++-- drivers/gpu/drm/i915/i915_request.c | 2 -- 7 files changed, 23 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index 3f4c6bdcc3c3..b186bb5bfb44 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -460,12 +460,12 @@ i915_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr } /** - * i915_gem_shrinker_register - Register the i915 shrinker + * i915_gem_driver_register__shrinker - Register the i915 shrinker * @i915: i915 device * * This function registers and sets up the i915 shrinker and OOM handler. */ -void i915_gem_shrinker_register(struct drm_i915_private *i915) +void i915_gem_driver_register__shrinker(struct drm_i915_private *i915) { i915->mm.shrinker.scan_objects = i915_gem_shrinker_scan; i915->mm.shrinker.count_objects = i915_gem_shrinker_count; @@ -486,7 +486,7 @@ void i915_gem_shrinker_register(struct drm_i915_private *i915) * * This function unregisters the i915 shrinker and OOM handler. */ -void i915_gem_shrinker_unregister(struct drm_i915_private *i915) +void i915_gem_driver_unregister__shrinker(struct drm_i915_private *i915) { WARN_ON(unregister_vmap_purge_notifier(&i915->mm.vmap_notifier)); WARN_ON(unregister_oom_notifier(&i915->mm.oom_notifier)); diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c index 01857c12f12f..50aa7e95124d 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -382,7 +382,7 @@ static bool assert_mmap_offset(struct drm_i915_private *i915, static void disable_retire_worker(struct drm_i915_private *i915) { - i915_gem_shrinker_unregister(i915); + i915_gem_driver_unregister__shrinker(i915); intel_gt_pm_get(&i915->gt); @@ -398,7 +398,7 @@ static void restore_retire_worker(struct drm_i915_private *i915) igt_flush_test(i915, I915_WAIT_LOCKED); mutex_unlock(&i915->drm.struct_mutex); - i915_gem_shrinker_register(i915); + i915_gem_driver_register__shrinker(i915); } static void mmap_offset_lock(struct drm_i915_private *i915) diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index 0cb457538e62..c937fa80cc3e 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -757,11 +757,8 @@ static void __intel_gt_set_wedged(struct intel_gt *gt) if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display) __intel_gt_reset(gt, ALL_ENGINES); - for_each_engine(engine, gt->i915, id) { + for_each_engine(engine, gt->i915, id) engine->submit_request = nop_submit_request; - engine->schedule = NULL; - } - gt->i915->caps.scheduler = 0; /* * Make sure no request can slip through without getting completed by diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 8c277fbf9663..d960672ae550 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1711,7 +1711,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) { struct drm_device *dev = &dev_priv->drm; - i915_gem_shrinker_register(dev_priv); + i915_gem_driver_register(dev_priv); i915_pmu_register(dev_priv); /* @@ -1791,7 +1791,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv) i915_teardown_sysfs(dev_priv); drm_dev_unplug(&dev_priv->drm); - i915_gem_shrinker_unregister(dev_priv); + i915_gem_driver_unregister(dev_priv); } static void i915_welcome_messages(struct drm_i915_private *dev_priv) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 0e715b1266f0..0e27563030ab 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2479,6 +2479,8 @@ static inline u32 i915_reset_engine_count(struct i915_gpu_error *error, void i915_gem_init_mmio(struct drm_i915_private *i915); int __must_check i915_gem_init(struct drm_i915_private *dev_priv); int __must_check i915_gem_init_hw(struct drm_i915_private *dev_priv); +void i915_gem_driver_register(struct drm_i915_private *i915); +void i915_gem_driver_unregister(struct drm_i915_private *i915); void i915_gem_driver_remove(struct drm_i915_private *dev_priv); void i915_gem_driver_release(struct drm_i915_private *dev_priv); int i915_gem_wait_for_idle(struct drm_i915_private *dev_priv, @@ -2578,8 +2580,8 @@ unsigned long i915_gem_shrink(struct drm_i915_private *i915, #define I915_SHRINK_WRITEBACK BIT(4) unsigned long i915_gem_shrink_all(struct drm_i915_private *i915); -void i915_gem_shrinker_register(struct drm_i915_private *i915); -void i915_gem_shrinker_unregister(struct drm_i915_private *i915); +void i915_gem_driver_register__shrinker(struct drm_i915_private *i915); +void i915_gem_driver_unregister__shrinker(struct drm_i915_private *i915); void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915, struct mutex *mutex); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 01dd0d1d9bf6..a84052f247f2 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1254,8 +1254,6 @@ int i915_gem_init_hw(struct drm_i915_private *i915) intel_mocs_init_l3cc_table(gt); - intel_engines_set_scheduler_caps(i915); - out: intel_uncore_forcewake_put(uncore, FORCEWAKE_ALL); return ret; @@ -1596,6 +1594,17 @@ int i915_gem_init(struct drm_i915_private *dev_priv) return ret; } +void i915_gem_driver_register(struct drm_i915_private *i915) +{ + i915_gem_driver_register__shrinker(i915); + intel_engines_set_scheduler_caps(i915); +} + +void i915_gem_driver_unregister(struct drm_i915_private *i915) +{ + i915_gem_driver_unregister__shrinker(i915); +} + void i915_gem_driver_remove(struct drm_i915_private *dev_priv) { GEM_BUG_ON(dev_priv->gt.awake); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 8ac7d14ec8c9..81094f250bdb 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -1198,7 +1198,6 @@ struct i915_request *__i915_request_commit(struct i915_request *rq) */ local_bh_disable(); i915_sw_fence_commit(&rq->semaphore); - rcu_read_lock(); /* RCU serialisation for set-wedged protection */ if (engine->schedule) { struct i915_sched_attr attr = rq->gem_context->sched; @@ -1228,7 +1227,6 @@ struct i915_request *__i915_request_commit(struct i915_request *rq) engine->schedule(rq, &attr); } - rcu_read_unlock(); i915_sw_fence_commit(&rq->submit); local_bh_enable(); /* Kick the execlists tasklet if just scheduled */ From patchwork Fri Jul 26 08:45:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060561 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 224EA13B1 for ; Fri, 26 Jul 2019 08:46:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1349A28894 for ; Fri, 26 Jul 2019 08:46:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0774D28A3B; Fri, 26 Jul 2019 08:46:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 51C12205FC for ; Fri, 26 Jul 2019 08:46:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 801ED6E8BE; Fri, 26 Jul 2019 08:46:49 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 66F5A6E8AC for ; Fri, 26 Jul 2019 08:46:30 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621079-1500050 for multiple; Fri, 26 Jul 2019 09:46:15 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:52 +0100 Message-Id: <20190726084613.22129-6-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 06/27] drm/i915: Move aliasing_ppgtt underneath its i915_ggtt X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP The aliasing_ppgtt provides a PIN_USER alias for the global gtt, so move it under the i915_ggtt to simplify later transformations to enable intel_context.vm. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +- .../drm/i915/gem/selftests/i915_gem_context.c | 2 +- drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 69 ++++++++++++------- drivers/gpu/drm/i915/i915_drv.h | 3 - drivers/gpu/drm/i915/i915_gem_gtt.c | 36 +++++----- drivers/gpu/drm/i915/i915_gem_gtt.h | 3 + drivers/gpu/drm/i915/i915_vma.c | 2 +- 7 files changed, 71 insertions(+), 51 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index ffb59d96d4d8..0f6b0678f548 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -459,8 +459,7 @@ __create_context(struct drm_i915_private *i915) i915_gem_context_set_recoverable(ctx); ctx->ring_size = 4 * PAGE_SIZE; - ctx->desc_template = - default_desc_template(i915, &i915->mm.aliasing_ppgtt->vm); + ctx->desc_template = default_desc_template(i915, NULL); for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++) ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES; @@ -2258,8 +2257,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, args->size = 0; if (ctx->vm) args->value = ctx->vm->total; - else if (to_i915(dev)->mm.aliasing_ppgtt) - args->value = to_i915(dev)->mm.aliasing_ppgtt->vm.total; + else if (to_i915(dev)->ggtt.alias) + args->value = to_i915(dev)->ggtt.alias->vm.total; else args->value = to_i915(dev)->ggtt.vm.total; break; diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index db7856f0f31e..bbd17d4b8ffd 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -1190,7 +1190,7 @@ static int igt_ctx_readonly(void *arg) goto out_unlock; } - vm = ctx->vm ?: &i915->mm.aliasing_ppgtt->vm; + vm = ctx->vm ?: &i915->ggtt.alias->vm; if (!vm || !vm->has_read_only) { err = 0; goto out_unlock; diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c index 1de19dac4a14..b056f25c66f2 100644 --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c @@ -1392,30 +1392,41 @@ static void ring_context_destroy(struct kref *ref) intel_context_free(ce); } -static int __context_pin_ppgtt(struct i915_gem_context *ctx) +static struct i915_address_space *vm_alias(struct intel_context *ce) +{ + struct i915_address_space *vm; + + vm = ce->gem_context->vm; + if (!vm) + vm = &ce->engine->gt->ggtt->alias->vm; + + return vm; +} + +static int __context_pin_ppgtt(struct intel_context *ce) { struct i915_address_space *vm; int err = 0; - vm = ctx->vm ?: &ctx->i915->mm.aliasing_ppgtt->vm; + vm = vm_alias(ce); if (vm) err = gen6_ppgtt_pin(i915_vm_to_ppgtt((vm))); return err; } -static void __context_unpin_ppgtt(struct i915_gem_context *ctx) +static void __context_unpin_ppgtt(struct intel_context *ce) { struct i915_address_space *vm; - vm = ctx->vm ?: &ctx->i915->mm.aliasing_ppgtt->vm; + vm = vm_alias(ce); if (vm) gen6_ppgtt_unpin(i915_vm_to_ppgtt(vm)); } static void ring_context_unpin(struct intel_context *ce) { - __context_unpin_ppgtt(ce->gem_context); + __context_unpin_ppgtt(ce); } static struct i915_vma * @@ -1509,7 +1520,7 @@ static int ring_context_pin(struct intel_context *ce) if (err) return err; - err = __context_pin_ppgtt(ce->gem_context); + err = __context_pin_ppgtt(ce); if (err) goto err_active; @@ -1701,7 +1712,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags) return 0; } -static int remap_l3(struct i915_request *rq, int slice) +static int remap_l3_slice(struct i915_request *rq, int slice) { u32 *cs, *remap_info = rq->i915->l3_parity.remap_info[slice]; int i; @@ -1729,15 +1740,34 @@ static int remap_l3(struct i915_request *rq, int slice) return 0; } +static int remap_l3(struct i915_request *rq) +{ + struct i915_gem_context *ctx = rq->gem_context; + int i, err; + + if (!ctx->remap_slice) + return 0; + + for (i = 0; i < MAX_L3_SLICES; i++) { + if (!(ctx->remap_slice & BIT(i))) + continue; + + err = remap_l3_slice(rq, i); + if (err) + return err; + } + + ctx->remap_slice = 0; + return 0; +} + static int switch_context(struct i915_request *rq) { struct intel_engine_cs *engine = rq->engine; - struct i915_gem_context *ctx = rq->gem_context; - struct i915_address_space *vm = - ctx->vm ?: &rq->i915->mm.aliasing_ppgtt->vm; + struct i915_address_space *vm = vm_alias(rq->hw_context); unsigned int unwind_mm = 0; u32 hw_flags = 0; - int ret, i; + int ret; GEM_BUG_ON(HAS_EXECLISTS(rq->i915)); @@ -1781,7 +1811,7 @@ static int switch_context(struct i915_request *rq) * as nothing actually executes using the kernel context; it * is purely used for flushing user contexts. */ - if (i915_gem_context_is_kernel(ctx)) + if (i915_gem_context_is_kernel(rq->gem_context)) hw_flags = MI_RESTORE_INHIBIT; ret = mi_set_context(rq, hw_flags); @@ -1815,18 +1845,9 @@ static int switch_context(struct i915_request *rq) goto err_mm; } - if (ctx->remap_slice) { - for (i = 0; i < MAX_L3_SLICES; i++) { - if (!(ctx->remap_slice & BIT(i))) - continue; - - ret = remap_l3(rq, i); - if (ret) - goto err_mm; - } - - ctx->remap_slice = 0; - } + ret = remap_l3(rq); + if (ret) + goto err_mm; return 0; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 0e27563030ab..ab548ec3da1c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -784,9 +784,6 @@ struct i915_gem_mm { */ struct vfsmount *gemfs; - /** PPGTT used for aliasing the PPGTT with the GTT */ - struct i915_ppgtt *aliasing_ppgtt; - struct notifier_block oom_notifier; struct notifier_block vmap_notifier; struct shrinker shrinker; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 4dd1fa956143..8304b98b0bf8 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -2446,18 +2446,18 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, pte_flags |= PTE_READ_ONLY; if (flags & I915_VMA_LOCAL_BIND) { - struct i915_ppgtt *appgtt = i915->mm.aliasing_ppgtt; + struct i915_ppgtt *alias = i915_vm_to_ggtt(vma->vm)->alias; if (!(vma->flags & I915_VMA_LOCAL_BIND)) { - ret = appgtt->vm.allocate_va_range(&appgtt->vm, - vma->node.start, - vma->size); + ret = alias->vm.allocate_va_range(&alias->vm, + vma->node.start, + vma->size); if (ret) return ret; } - appgtt->vm.insert_entries(&appgtt->vm, vma, cache_level, - pte_flags); + alias->vm.insert_entries(&alias->vm, vma, + cache_level, pte_flags); } if (flags & I915_VMA_GLOBAL_BIND) { @@ -2485,7 +2485,8 @@ static void aliasing_gtt_unbind_vma(struct i915_vma *vma) } if (vma->flags & I915_VMA_LOCAL_BIND) { - struct i915_address_space *vm = &i915->mm.aliasing_ppgtt->vm; + struct i915_address_space *vm = + &i915_vm_to_ggtt(vma->vm)->alias->vm; vm->clear_range(vm, vma->node.start, vma->size); } @@ -2542,13 +2543,12 @@ static void i915_gtt_color_adjust(const struct drm_mm_node *node, *end -= I915_GTT_PAGE_SIZE; } -static int init_aliasing_ppgtt(struct drm_i915_private *i915) +static int init_aliasing_ppgtt(struct i915_ggtt *ggtt) { - struct i915_ggtt *ggtt = &i915->ggtt; struct i915_ppgtt *ppgtt; int err; - ppgtt = i915_ppgtt_create(i915); + ppgtt = i915_ppgtt_create(ggtt->vm.i915); if (IS_ERR(ppgtt)) return PTR_ERR(ppgtt); @@ -2567,7 +2567,7 @@ static int init_aliasing_ppgtt(struct drm_i915_private *i915) if (err) goto err_ppgtt; - i915->mm.aliasing_ppgtt = ppgtt; + ggtt->alias = ppgtt; GEM_BUG_ON(ggtt->vm.vma_ops.bind_vma != ggtt_bind_vma); ggtt->vm.vma_ops.bind_vma = aliasing_gtt_bind_vma; @@ -2582,14 +2582,14 @@ static int init_aliasing_ppgtt(struct drm_i915_private *i915) return err; } -static void fini_aliasing_ppgtt(struct drm_i915_private *i915) +static void fini_aliasing_ppgtt(struct i915_ggtt *ggtt) { - struct i915_ggtt *ggtt = &i915->ggtt; + struct drm_i915_private *i915 = ggtt->vm.i915; struct i915_ppgtt *ppgtt; mutex_lock(&i915->drm.struct_mutex); - ppgtt = fetch_and_zero(&i915->mm.aliasing_ppgtt); + ppgtt = fetch_and_zero(&ggtt->alias); if (!ppgtt) goto out; @@ -2706,7 +2706,7 @@ int i915_init_ggtt(struct drm_i915_private *i915) return ret; if (INTEL_PPGTT(i915) == INTEL_PPGTT_ALIASING) { - ret = init_aliasing_ppgtt(i915); + ret = init_aliasing_ppgtt(&i915->ggtt); if (ret) cleanup_init_ggtt(&i915->ggtt); } @@ -2752,7 +2752,7 @@ void i915_ggtt_driver_release(struct drm_i915_private *i915) { struct pagevec *pvec; - fini_aliasing_ppgtt(i915); + fini_aliasing_ppgtt(&i915->ggtt); ggtt_cleanup_hw(&i915->ggtt); @@ -3585,7 +3585,7 @@ int i915_gem_gtt_reserve(struct i915_address_space *vm, GEM_BUG_ON(!IS_ALIGNED(size, I915_GTT_PAGE_SIZE)); GEM_BUG_ON(!IS_ALIGNED(offset, I915_GTT_MIN_ALIGNMENT)); GEM_BUG_ON(range_overflows(offset, size, vm->total)); - GEM_BUG_ON(vm == &vm->i915->mm.aliasing_ppgtt->vm); + GEM_BUG_ON(vm == &vm->i915->ggtt.alias->vm); GEM_BUG_ON(drm_mm_node_allocated(node)); node->size = size; @@ -3682,7 +3682,7 @@ int i915_gem_gtt_insert(struct i915_address_space *vm, GEM_BUG_ON(start >= end); GEM_BUG_ON(start > 0 && !IS_ALIGNED(start, I915_GTT_PAGE_SIZE)); GEM_BUG_ON(end < U64_MAX && !IS_ALIGNED(end, I915_GTT_PAGE_SIZE)); - GEM_BUG_ON(vm == &vm->i915->mm.aliasing_ppgtt->vm); + GEM_BUG_ON(vm == &vm->i915->ggtt.alias->vm); GEM_BUG_ON(drm_mm_node_allocated(node)); if (unlikely(range_overflows(start, size, end))) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index cea59ef1a365..51274483502e 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -394,6 +394,9 @@ struct i915_ggtt { void __iomem *gsm; void (*invalidate)(struct i915_ggtt *ggtt); + /** PPGTT used for aliasing the PPGTT with the GTT */ + struct i915_ppgtt *alias; + bool do_idle_maps; int mtrr; diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index ee73baf29415..eb16a1a93bbc 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -104,7 +104,7 @@ vma_create(struct drm_i915_gem_object *obj, struct rb_node *rb, **p; /* The aliasing_ppgtt should never be used directly! */ - GEM_BUG_ON(vm == &vm->i915->mm.aliasing_ppgtt->vm); + GEM_BUG_ON(vm == &vm->i915->ggtt.alias->vm); vma = i915_vma_alloc(); if (vma == NULL) From patchwork Fri Jul 26 08:45:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060567 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EA86D912 for ; Fri, 26 Jul 2019 08:48:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DA895205FC for ; Fri, 26 Jul 2019 08:48:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CE9CF28A3B; Fri, 26 Jul 2019 08:48:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 29EBF205FC for ; Fri, 26 Jul 2019 08:48:01 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id F1D0B6E8B0; Fri, 26 Jul 2019 08:47:59 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 56A186E8B0 for ; Fri, 26 Jul 2019 08:47:58 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621080-1500050 for multiple; Fri, 26 Jul 2019 09:46:15 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:53 +0100 Message-Id: <20190726084613.22129-7-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 07/27] drm/i915/gt: Provide a local intel_context.vm X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Track the currently bound address space used by the HW context. Minor conversions to use the local intel_context.vm are made, leaving behind some more surgery required to make intel_context the primary through the selftests. Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/gem/i915_gem_client_blt.c | 4 +--- drivers/gpu/drm/i915/gem/i915_gem_context.c | 15 +++++++++++---- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 11 +++-------- drivers/gpu/drm/i915/gem/i915_gem_object_blt.c | 6 +----- .../gpu/drm/i915/gem/selftests/i915_gem_context.c | 2 +- drivers/gpu/drm/i915/gt/intel_context.c | 4 ++++ drivers/gpu/drm/i915/gt/intel_context_types.h | 4 +++- drivers/gpu/drm/i915/gt/intel_lrc.c | 9 +++------ drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 6 +++--- drivers/gpu/drm/i915/gvt/scheduler.c | 2 +- 10 files changed, 31 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c index 6f537e8e4dea..2312a0c6af89 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c @@ -250,13 +250,11 @@ int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj, u32 value) { struct drm_i915_private *i915 = to_i915(obj->base.dev); - struct i915_gem_context *ctx = ce->gem_context; - struct i915_address_space *vm = ctx->vm ?: &i915->ggtt.vm; struct clear_pages_work *work; struct i915_sleeve *sleeve; int err; - sleeve = create_sleeve(vm, obj, pages, page_sizes); + sleeve = create_sleeve(ce->vm, obj, pages, page_sizes); if (IS_ERR(sleeve)) return PTR_ERR(sleeve); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 0f6b0678f548..b28c7ca681a8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -475,10 +475,18 @@ static struct i915_address_space * __set_ppgtt(struct i915_gem_context *ctx, struct i915_address_space *vm) { struct i915_address_space *old = ctx->vm; + struct i915_gem_engines_iter it; + struct intel_context *ce; ctx->vm = i915_vm_get(vm); ctx->desc_template = default_desc_template(ctx->i915, vm); + for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) { + i915_vm_put(ce->vm); + ce->vm = i915_vm_get(vm); + } + i915_gem_context_unlock_engines(ctx); + return old; } @@ -1004,7 +1012,7 @@ static void set_ppgtt_barrier(void *data) static int emit_ppgtt_update(struct i915_request *rq, void *data) { - struct i915_address_space *vm = rq->gem_context->vm; + struct i915_address_space *vm = rq->hw_context->vm; struct intel_engine_cs *engine = rq->engine; u32 base = engine->mmio_base; u32 *cs; @@ -1113,9 +1121,8 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv, set_ppgtt_barrier, old); if (err) { - ctx->vm = old; - ctx->desc_template = default_desc_template(ctx->i915, old); - i915_vm_put(vm); + i915_vm_put(__set_ppgtt(ctx, old)); + i915_vm_put(old); } unlock: diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 8a2047c4e7c3..cbd7c6e3a1f8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -223,7 +223,6 @@ struct i915_execbuffer { struct intel_engine_cs *engine; /** engine to queue the request to */ struct intel_context *context; /* logical state for the request */ struct i915_gem_context *gem_context; /** caller's context */ - struct i915_address_space *vm; /** GTT and vma for the request */ struct i915_request *request; /** our request to build */ struct i915_vma *batch; /** identity of the batch obj/vma */ @@ -697,7 +696,7 @@ static int eb_reserve(struct i915_execbuffer *eb) case 1: /* Too fragmented, unbind everything and retry */ - err = i915_gem_evict_vm(eb->vm); + err = i915_gem_evict_vm(eb->context->vm); if (err) return err; break; @@ -725,12 +724,8 @@ static int eb_select_context(struct i915_execbuffer *eb) return -ENOENT; eb->gem_context = ctx; - if (ctx->vm) { - eb->vm = ctx->vm; + if (ctx->vm) eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT; - } else { - eb->vm = &eb->i915->ggtt.vm; - } eb->context_flags = 0; if (test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags)) @@ -832,7 +827,7 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb) goto err_vma; } - vma = i915_vma_instance(obj, eb->vm, NULL); + vma = i915_vma_instance(obj, eb->context->vm, NULL); if (IS_ERR(vma)) { err = PTR_ERR(vma); goto err_obj; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c index cb42e3a312e2..685064af32d1 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c @@ -47,15 +47,11 @@ int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj, struct intel_context *ce, u32 value) { - struct drm_i915_private *i915 = to_i915(obj->base.dev); - struct i915_gem_context *ctx = ce->gem_context; - struct i915_address_space *vm = ctx->vm ?: &i915->ggtt.vm; struct i915_request *rq; struct i915_vma *vma; int err; - /* XXX: ce->vm please */ - vma = i915_vma_instance(obj, vm, NULL); + vma = i915_vma_instance(obj, ce->vm, NULL); if (IS_ERR(vma)) return PTR_ERR(vma); diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index bbd17d4b8ffd..7f9f6701b32c 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -747,7 +747,7 @@ emit_rpcs_query(struct drm_i915_gem_object *obj, GEM_BUG_ON(!intel_engine_can_store_dword(ce->engine)); - vma = i915_vma_instance(obj, ce->gem_context->vm, NULL); + vma = i915_vma_instance(obj, ce->vm, NULL); if (IS_ERR(vma)) return PTR_ERR(vma); diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 211ac6568a5d..34c8e37a73b8 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -209,6 +209,8 @@ intel_context_init(struct intel_context *ce, kref_init(&ce->ref); ce->gem_context = ctx; + ce->vm = i915_vm_get(ctx->vm ?: &engine->gt->ggtt->vm); + ce->engine = engine; ce->ops = engine->cops; ce->sseu = engine->sseu; @@ -224,6 +226,8 @@ intel_context_init(struct intel_context *ce, void intel_context_fini(struct intel_context *ce) { + i915_vm_put(ce->vm); + mutex_destroy(&ce->pin_mutex); i915_active_fini(&ce->active); } diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 4c0e211c715d..68a7e979b1a9 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -36,7 +36,6 @@ struct intel_context_ops { struct intel_context { struct kref ref; - struct i915_gem_context *gem_context; struct intel_engine_cs *engine; struct intel_engine_cs *inflight; #define intel_context_inflight(ce) ptr_mask_bits((ce)->inflight, 2) @@ -44,6 +43,9 @@ struct intel_context { #define intel_context_inflight_inc(ce) ptr_count_inc(&(ce)->inflight) #define intel_context_inflight_dec(ce) ptr_count_dec(&(ce)->inflight) + struct i915_address_space *vm; + struct i915_gem_context *gem_context; + struct list_head signal_link; struct list_head signals; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index b85ee12c2451..d46a6d9b3316 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1662,8 +1662,6 @@ __execlists_context_pin(struct intel_context *ce, void *vaddr; int ret; - GEM_BUG_ON(!ce->gem_context->vm); - ret = execlists_context_deferred_alloc(ce, engine); if (ret) goto err; @@ -1773,8 +1771,7 @@ static int gen8_emit_init_breadcrumb(struct i915_request *rq) static int emit_pdps(struct i915_request *rq) { const struct intel_engine_cs * const engine = rq->engine; - struct i915_ppgtt * const ppgtt = - i915_vm_to_ppgtt(rq->gem_context->vm); + struct i915_ppgtt * const ppgtt = i915_vm_to_ppgtt(rq->hw_context->vm); int err, i; u32 *cs; @@ -1847,7 +1844,7 @@ static int execlists_request_alloc(struct i915_request *request) */ /* Unconditionally invalidate GPU caches and TLBs. */ - if (i915_vm_is_4lvl(request->gem_context->vm)) + if (i915_vm_is_4lvl(request->hw_context->vm)) ret = request->engine->emit_flush(request, EMIT_INVALIDATE); else ret = emit_pdps(request); @@ -2997,7 +2994,7 @@ static void execlists_init_reg_state(u32 *regs, struct intel_engine_cs *engine, struct intel_ring *ring) { - struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(ce->gem_context->vm); + struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(ce->vm); bool rcs = engine->class == RENDER_CLASS; u32 base = engine->mmio_base; diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c index b056f25c66f2..38ec11ae6ed7 100644 --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c @@ -1396,9 +1396,9 @@ static struct i915_address_space *vm_alias(struct intel_context *ce) { struct i915_address_space *vm; - vm = ce->gem_context->vm; - if (!vm) - vm = &ce->engine->gt->ggtt->alias->vm; + vm = ce->vm; + if (i915_is_ggtt(vm)) + vm = &i915_vm_to_ggtt(vm)->alias->vm; return vm; } diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c index 2144fb46d0e1..f68798ab1e7c 100644 --- a/drivers/gpu/drm/i915/gvt/scheduler.c +++ b/drivers/gpu/drm/i915/gvt/scheduler.c @@ -1156,7 +1156,7 @@ void intel_vgpu_clean_submission(struct intel_vgpu *vgpu) intel_vgpu_select_submission_ops(vgpu, ALL_ENGINES, 0); - i915_context_ppgtt_root_restore(s, i915_vm_to_ppgtt(s->shadow[0]->gem_context->vm)); + i915_context_ppgtt_root_restore(s, i915_vm_to_ppgtt(s->shadow[0]->vm)); for_each_engine(engine, vgpu->gvt->dev_priv, id) intel_context_unpin(s->shadow[id]); From patchwork Fri Jul 26 08:45:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060551 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F068D13A0 for ; Fri, 26 Jul 2019 08:46:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DFA6E28A3B for ; Fri, 26 Jul 2019 08:46:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D406628A5A; Fri, 26 Jul 2019 08:46:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 6EE2D28A5C for ; Fri, 26 Jul 2019 08:46:49 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A4AF96E8B4; Fri, 26 Jul 2019 08:46:48 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 399D86E8A9 for ; Fri, 26 Jul 2019 08:46:30 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621081-1500050 for multiple; Fri, 26 Jul 2019 09:46:15 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:54 +0100 Message-Id: <20190726084613.22129-8-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 08/27] drm/i915: Remove lrc default desc from GEM context X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We only compute the lrc_descriptor() on pinning the context, i.e. infrequently, so we do not benefit from storing the template as the addressing mode is also fixed for the lifetime of the intel_context. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 28 ++----------------- .../gpu/drm/i915/gem/i915_gem_context_types.h | 2 -- drivers/gpu/drm/i915/gt/intel_lrc.c | 12 +++++--- drivers/gpu/drm/i915/gvt/scheduler.c | 3 -- 4 files changed, 10 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index b28c7ca681a8..1b3dc7258ef2 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -397,30 +397,6 @@ static void context_close(struct i915_gem_context *ctx) i915_gem_context_put(ctx); } -static u32 default_desc_template(const struct drm_i915_private *i915, - const struct i915_address_space *vm) -{ - u32 address_mode; - u32 desc; - - desc = GEN8_CTX_VALID | GEN8_CTX_PRIVILEGE; - - address_mode = INTEL_LEGACY_32B_CONTEXT; - if (vm && i915_vm_is_4lvl(vm)) - address_mode = INTEL_LEGACY_64B_CONTEXT; - desc |= address_mode << GEN8_CTX_ADDRESSING_MODE_SHIFT; - - if (IS_GEN(i915, 8)) - desc |= GEN8_CTX_L3LLC_COHERENT; - - /* TODO: WaDisableLiteRestore when we start using semaphore - * signalling between Command Streamers - * ring->ctx_desc_template |= GEN8_CTX_FORCE_RESTORE; - */ - - return desc; -} - static struct i915_gem_context * __create_context(struct drm_i915_private *i915) { @@ -459,7 +435,6 @@ __create_context(struct drm_i915_private *i915) i915_gem_context_set_recoverable(ctx); ctx->ring_size = 4 * PAGE_SIZE; - ctx->desc_template = default_desc_template(i915, NULL); for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++) ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES; @@ -478,8 +453,9 @@ __set_ppgtt(struct i915_gem_context *ctx, struct i915_address_space *vm) struct i915_gem_engines_iter it; struct intel_context *ce; + GEM_BUG_ON(old && i915_vm_is_4lvl(vm) != i915_vm_is_4lvl(old)); + ctx->vm = i915_vm_get(vm); - ctx->desc_template = default_desc_template(ctx->i915, vm); for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) { i915_vm_put(ce->vm); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h index 0ee61482ef94..a02d98494078 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h @@ -171,8 +171,6 @@ struct i915_gem_context { /** ring_size: size for allocating the per-engine ring buffer */ u32 ring_size; - /** desc_template: invariant fields for the HW context descriptor */ - u32 desc_template; /** guilty_count: How many times this context has caused a GPU hang. */ atomic_t guilty_count; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index d46a6d9b3316..751799d5d1a9 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -418,13 +418,17 @@ lrc_descriptor(struct intel_context *ce, struct intel_engine_cs *engine) BUILD_BUG_ON(MAX_CONTEXT_HW_ID > (BIT(GEN8_CTX_ID_WIDTH))); BUILD_BUG_ON(GEN11_MAX_CONTEXT_HW_ID > (BIT(GEN11_SW_CTX_ID_WIDTH))); - desc = ctx->desc_template; /* bits 0-11 */ - GEM_BUG_ON(desc & GENMASK_ULL(63, 12)); + desc = INTEL_LEGACY_32B_CONTEXT; + if (i915_vm_is_4lvl(ce->vm)) + desc = INTEL_LEGACY_64B_CONTEXT; + desc <<= GEN8_CTX_ADDRESSING_MODE_SHIFT; + + desc |= GEN8_CTX_VALID | GEN8_CTX_PRIVILEGE; + if (IS_GEN(engine->i915, 8)) + desc |= GEN8_CTX_L3LLC_COHERENT; desc |= i915_ggtt_offset(ce->state) + LRC_HEADER_PAGES * PAGE_SIZE; /* bits 12-31 */ - GEM_BUG_ON(desc & GENMASK_ULL(63, 32)); - /* * The following 32bits are copied into the OA reports (dword 2). * Consider updating oa_get_render_ctx_id in i915_perf.c when changing diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c index f68798ab1e7c..4c018fb1359c 100644 --- a/drivers/gpu/drm/i915/gvt/scheduler.c +++ b/drivers/gpu/drm/i915/gvt/scheduler.c @@ -291,9 +291,6 @@ shadow_context_descriptor_update(struct intel_context *ce, * Update bits 0-11 of the context descriptor which includes flags * like GEN8_CTX_* cached in desc_template */ - desc &= U64_MAX << 12; - desc |= ce->gem_context->desc_template & ((1ULL << 12) - 1); - desc &= ~(0x3 << GEN8_CTX_ADDRESSING_MODE_SHIFT); desc |= workload->ctx_desc.addressing_mode << GEN8_CTX_ADDRESSING_MODE_SHIFT; From patchwork Fri Jul 26 08:45:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060541 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0CCEB13A0 for ; Fri, 26 Jul 2019 08:46:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F13AB28894 for ; Fri, 26 Jul 2019 08:46:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E580628A59; Fri, 26 Jul 2019 08:46:36 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5F16A28A3D for ; Fri, 26 Jul 2019 08:46:36 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D851B6E8AF; Fri, 26 Jul 2019 08:46:32 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0F95F6E8A7 for ; Fri, 26 Jul 2019 08:46:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621082-1500050 for multiple; Fri, 26 Jul 2019 09:46:16 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:55 +0100 Message-Id: <20190726084613.22129-9-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 09/27] drm/i915: Push the ring creation flags to the backend X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Push the ring creation flags from the outer GEM context to the inner intel_cotnext to avoid an unsightly back-reference from inside the backend. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 25 +++++++++++++------ .../gpu/drm/i915/gem/i915_gem_context_types.h | 3 --- drivers/gpu/drm/i915/gt/intel_context.c | 1 + drivers/gpu/drm/i915/gt/intel_context.h | 5 ++++ drivers/gpu/drm/i915/gt/intel_lrc.c | 5 ++-- drivers/gpu/drm/i915/gt/mock_engine.c | 12 +++------ drivers/gpu/drm/i915/i915_debugfs.c | 23 +++++++++++------ 7 files changed, 45 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 1b3dc7258ef2..18b226bc5e3a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -296,6 +296,8 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx) return ERR_CAST(ce); } + ce->ring = __intel_context_ring_size(SZ_16K); + e->engines[id] = ce; } e->num_engines = id; @@ -434,8 +436,6 @@ __create_context(struct drm_i915_private *i915) i915_gem_context_set_bannable(ctx); i915_gem_context_set_recoverable(ctx); - ctx->ring_size = 4 * PAGE_SIZE; - for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++) ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES; @@ -565,8 +565,15 @@ i915_gem_context_create_gvt(struct drm_device *dev) i915_gem_context_set_closed(ctx); /* not user accessible */ i915_gem_context_clear_bannable(ctx); i915_gem_context_set_force_single_submission(ctx); - if (!USES_GUC_SUBMISSION(to_i915(dev))) - ctx->ring_size = 512 * PAGE_SIZE; /* Max ring buffer size */ + if (!USES_GUC_SUBMISSION(to_i915(dev))) { + const unsigned long ring_size = 512 * SZ_4K; /* max */ + struct i915_gem_engines_iter it; + struct intel_context *ce; + + for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) + ce->ring = __intel_context_ring_size(ring_size); + i915_gem_context_unlock_engines(ctx); + } GEM_BUG_ON(i915_gem_context_is_kernel(ctx)); out: @@ -605,7 +612,6 @@ i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio) i915_gem_context_clear_bannable(ctx); ctx->sched.priority = I915_USER_PRIORITY(prio); - ctx->ring_size = PAGE_SIZE; GEM_BUG_ON(!i915_gem_context_is_kernel(ctx)); @@ -1589,6 +1595,7 @@ set_engines(struct i915_gem_context *ctx, for (n = 0; n < num_engines; n++) { struct i915_engine_class_instance ci; struct intel_engine_cs *engine; + struct intel_context *ce; if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) { __free_engines(set.engines, n); @@ -1611,11 +1618,15 @@ set_engines(struct i915_gem_context *ctx, return -ENOENT; } - set.engines->engines[n] = intel_context_create(ctx, engine); - if (!set.engines->engines[n]) { + ce = intel_context_create(ctx, engine); + if (!ce) { __free_engines(set.engines, n); return -ENOMEM; } + + ce->ring = __intel_context_ring_size(SZ_16K); + + set.engines->engines[n] = ce; } set.engines->num_engines = num_engines; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h index a02d98494078..260d59cc3de8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h @@ -169,9 +169,6 @@ struct i915_gem_context { struct i915_sched_attr sched; - /** ring_size: size for allocating the per-engine ring buffer */ - u32 ring_size; - /** guilty_count: How many times this context has caused a GPU hang. */ atomic_t guilty_count; /** diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 34c8e37a73b8..8f645737cde0 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -214,6 +214,7 @@ intel_context_init(struct intel_context *ce, ce->engine = engine; ce->ops = engine->cops; ce->sseu = engine->sseu; + ce->ring = __intel_context_ring_size(SZ_4K); INIT_LIST_HEAD(&ce->signal_link); INIT_LIST_HEAD(&ce->signals); diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h index 07f9924de48f..13f28dd316bc 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.h +++ b/drivers/gpu/drm/i915/gt/intel_context.h @@ -136,4 +136,9 @@ int intel_context_prepare_remote_request(struct intel_context *ce, struct i915_request *intel_context_create_request(struct intel_context *ce); +static inline struct intel_ring *__intel_context_ring_size(u64 sz) +{ + return u64_to_ptr(struct intel_ring, sz); +} + #endif /* __INTEL_CONTEXT_H__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 751799d5d1a9..c899ce34d0d3 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -3196,9 +3196,8 @@ static int execlists_context_deferred_alloc(struct intel_context *ce, goto error_deref_obj; } - ring = intel_engine_create_ring(engine, - timeline, - ce->gem_context->ring_size); + ring = intel_engine_create_ring(engine, timeline, + (unsigned long)ce->ring); intel_timeline_put(timeline); if (IS_ERR(ring)) { ret = PTR_ERR(ring); diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 10cb312462e5..ba076886e79e 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -131,6 +131,7 @@ static void hw_delay_complete(struct timer_list *t) static void mock_context_unpin(struct intel_context *ce) { mock_timeline_unpin(ce->ring->timeline); + mock_ring_free(ce->ring); } static void mock_context_destroy(struct kref *ref) @@ -139,9 +140,6 @@ static void mock_context_destroy(struct kref *ref) GEM_BUG_ON(intel_context_is_pinned(ce)); - if (ce->ring) - mock_ring_free(ce->ring); - intel_context_fini(ce); intel_context_free(ce); } @@ -150,11 +148,9 @@ static int mock_context_pin(struct intel_context *ce) { int ret; - if (!ce->ring) { - ce->ring = mock_ring(ce->engine); - if (!ce->ring) - return -ENOMEM; - } + ce->ring = mock_ring(ce->engine); + if (!ce->ring) + return -ENOMEM; ret = intel_context_active_acquire(ce); if (ret) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 1f0408ff345b..63d4a7842a23 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -328,10 +328,14 @@ static void print_context_stats(struct seq_file *m, for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) { - if (ce->state) - per_file_stats(0, ce->state->obj, &kstats); - if (ce->ring) + intel_context_lock_pinned(ce); + if (intel_context_is_pinned(ce)) { + if (ce->state) + per_file_stats(0, + ce->state->obj, &kstats); per_file_stats(0, ce->ring->vma->obj, &kstats); + } + intel_context_unlock_pinned(ce); } i915_gem_context_unlock_engines(ctx); @@ -1592,12 +1596,15 @@ static int i915_context_status(struct seq_file *m, void *unused) for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) { - seq_printf(m, "%s: ", ce->engine->name); - if (ce->state) - describe_obj(m, ce->state->obj); - if (ce->ring) + intel_context_lock_pinned(ce); + if (intel_context_is_pinned(ce)) { + seq_printf(m, "%s: ", ce->engine->name); + if (ce->state) + describe_obj(m, ce->state->obj); describe_ctx_ring(m, ce->ring); - seq_putc(m, '\n'); + seq_putc(m, '\n'); + } + intel_context_unlock_pinned(ce); } i915_gem_context_unlock_engines(ctx); From patchwork Fri Jul 26 08:45:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060533 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 88BC61398 for ; Fri, 26 Jul 2019 08:46:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7A27E205FC for ; Fri, 26 Jul 2019 08:46:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6D19928A3B; Fri, 26 Jul 2019 08:46:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 240D1205FC for ; Fri, 26 Jul 2019 08:46:29 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 94C1F6E8AB; Fri, 26 Jul 2019 08:46:27 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id D237E6E8A9 for ; Fri, 26 Jul 2019 08:46:25 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621083-1500050 for multiple; Fri, 26 Jul 2019 09:46:16 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:56 +0100 Message-Id: <20190726084613.22129-10-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 10/27] drm/i915: Flush extra hard after writing relocations through the GTT X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: stable@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Recently discovered in commit bdae33b8b82b ("drm/i915: Use maximum write flush for pwrite_gtt") was that we needed to our full write barrier before changing the GGTT PTE to ensure that our indirect writes through the GTT landed before the PTE changed (and the writes end up in a different page). That also applies to our GGTT relocation path. Signed-off-by: Chris Wilson Cc: stable@vger.kernel.org --- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index cbd7c6e3a1f8..4db4463089ce 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1014,11 +1014,12 @@ static void reloc_cache_reset(struct reloc_cache *cache) kunmap_atomic(vaddr); i915_gem_object_finish_access((struct drm_i915_gem_object *)cache->node.mm); } else { - wmb(); + struct i915_ggtt *ggtt = cache_to_ggtt(cache); + + intel_gt_flush_ggtt_writes(ggtt->vm.gt); io_mapping_unmap_atomic((void __iomem *)vaddr); - if (cache->node.allocated) { - struct i915_ggtt *ggtt = cache_to_ggtt(cache); + if (cache->node.allocated) { ggtt->vm.clear_range(&ggtt->vm, cache->node.start, cache->node.size); @@ -1073,6 +1074,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, void *vaddr; if (cache->vaddr) { + intel_gt_flush_ggtt_writes(ggtt->vm.gt); io_mapping_unmap_atomic((void __force __iomem *) unmask_page(cache->vaddr)); } else { struct i915_vma *vma; @@ -1114,7 +1116,6 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, offset = cache->node.start; if (cache->node.allocated) { - wmb(); ggtt->vm.insert_page(&ggtt->vm, i915_gem_object_get_dma_address(obj, page), offset, I915_CACHE_NONE, 0); From patchwork Fri Jul 26 08:45:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060569 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A6DAC912 for ; Fri, 26 Jul 2019 08:48:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 97C16205FC for ; Fri, 26 Jul 2019 08:48:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8C1F128A3D; Fri, 26 Jul 2019 08:48:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C9FAA205FC for ; Fri, 26 Jul 2019 08:48:06 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0230F6E8BF; Fri, 26 Jul 2019 08:48:06 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 329646E8B5 for ; Fri, 26 Jul 2019 08:47:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621084-1500050 for multiple; Fri, 26 Jul 2019 09:46:16 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:57 +0100 Message-Id: <20190726084613.22129-11-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 11/27] drm/i915: Hide unshrinkable context objects from the shrinker X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP The shrinker cannot touch objects used by the contexts (logical state and ring). Currently we mark those as "pin_global" to let the shrinker skip over them, however, if we remove them from the shrinker lists entirely, we don't event have to include them in our shrink accounting. By keeping the unshrinkable objects in our shrinker tracking, we report a large number of objects available to be shrunk, and leave the shrinker deeply unsatisfied when we fail to reclaim those. The shrinker will persist in trying to reclaim the unavailable objects, forcing the system into a livelock (not even hitting the dread oomkiller). v2: Extend unshrinkable protection for perma-pinned scratch and guc allocations (Tvrtko) v3: Notice that we should be pinned when marking unshrinkable and so the link cannot be empty; merge duplicate paths. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 11 +--- drivers/gpu/drm/i915/gem/i915_gem_object.h | 4 ++ drivers/gpu/drm/i915/gem/i915_gem_pages.c | 13 +---- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 58 ++++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_context.c | 4 +- drivers/gpu/drm/i915/gt/intel_gt.c | 3 +- drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 17 +++--- drivers/gpu/drm/i915/gt/uc/intel_guc.c | 2 +- drivers/gpu/drm/i915/i915_debugfs.c | 3 +- drivers/gpu/drm/i915/i915_vma.c | 16 ++++++ drivers/gpu/drm/i915/i915_vma.h | 4 ++ 11 files changed, 102 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index d5197a2a106f..4ea97fca9c35 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -63,6 +63,8 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj, spin_lock_init(&obj->vma.lock); INIT_LIST_HEAD(&obj->vma.list); + INIT_LIST_HEAD(&obj->mm.link); + INIT_LIST_HEAD(&obj->lut_list); INIT_LIST_HEAD(&obj->batch_pool_link); @@ -273,14 +275,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) * or else we may oom whilst there are plenty of deferred * freed objects. */ - if (i915_gem_object_has_pages(obj) && - i915_gem_object_is_shrinkable(obj)) { - unsigned long flags; - - spin_lock_irqsave(&i915->mm.obj_lock, flags); - list_del_init(&obj->mm.link); - spin_unlock_irqrestore(&i915->mm.obj_lock, flags); - } + i915_gem_object_make_unshrinkable(obj); /* * Since we require blocking on struct_mutex to unbind the freed diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 67aea07ea019..3714cf234d64 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -394,6 +394,10 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, unsigned int flags); void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma); +void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj); +void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj); +void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj); + static inline bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) { if (obj->cache_dirty) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index b36ad269f4ea..92ad3cc220e3 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -153,24 +153,13 @@ static void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj) struct sg_table * __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj) { - struct drm_i915_private *i915 = to_i915(obj->base.dev); struct sg_table *pages; pages = fetch_and_zero(&obj->mm.pages); if (IS_ERR_OR_NULL(pages)) return pages; - if (i915_gem_object_is_shrinkable(obj)) { - unsigned long flags; - - spin_lock_irqsave(&i915->mm.obj_lock, flags); - - list_del(&obj->mm.link); - i915->mm.shrink_count--; - i915->mm.shrink_memory -= obj->base.size; - - spin_unlock_irqrestore(&i915->mm.obj_lock, flags); - } + i915_gem_object_make_unshrinkable(obj); if (obj->mm.mapping) { void *ptr; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index b186bb5bfb44..b5506ca14383 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -530,3 +530,61 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915, if (unlock) mutex_release(&i915->drm.struct_mutex.dep_map, 0, _RET_IP_); } + +#define obj_to_i915(obj__) to_i915((obj__)->base.dev) + +void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj) +{ + /* + * We can only be called while the pages are pinned or when + * the pages are released. If pinned, we should only be called + * from a single caller under controlled conditions; and on release + * only one caller may release us. Neither the two may cross. + */ + if (!list_empty(&obj->mm.link)) { /* pinned by caller */ + struct drm_i915_private *i915 = obj_to_i915(obj); + unsigned long flags; + + spin_lock_irqsave(&i915->mm.obj_lock, flags); + GEM_BUG_ON(list_empty(&obj->mm.link)); + + list_del_init(&obj->mm.link); + i915->mm.shrink_count--; + i915->mm.shrink_memory -= obj->base.size; + + spin_unlock_irqrestore(&i915->mm.obj_lock, flags); + } +} + +static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, + struct list_head *head) +{ + GEM_BUG_ON(!i915_gem_object_has_pages(obj)); + GEM_BUG_ON(!list_empty(&obj->mm.link)); + + if (i915_gem_object_is_shrinkable(obj)) { + struct drm_i915_private *i915 = obj_to_i915(obj); + unsigned long flags; + + spin_lock_irqsave(&i915->mm.obj_lock, flags); + GEM_BUG_ON(!kref_read(&obj->base.refcount)); + + list_add_tail(&obj->mm.link, head); + i915->mm.shrink_count++; + i915->mm.shrink_memory += obj->base.size; + + spin_unlock_irqrestore(&i915->mm.obj_lock, flags); + } +} + +void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) +{ + __i915_gem_object_make_shrinkable(obj, + &obj_to_i915(obj)->mm.shrink_list); +} + +void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj) +{ + __i915_gem_object_make_shrinkable(obj, + &obj_to_i915(obj)->mm.purge_list); +} diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 8f645737cde0..eb46b7e3c1d0 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -118,7 +118,7 @@ static int __context_pin_state(struct i915_vma *vma) * And mark it as a globally pinned object to let the shrinker know * it cannot reclaim the object until we release it. */ - vma->obj->pin_global++; + i915_vma_make_unshrinkable(vma); vma->obj->mm.dirty = true; return 0; @@ -126,8 +126,8 @@ static int __context_pin_state(struct i915_vma *vma) static void __context_unpin_state(struct i915_vma *vma) { - vma->obj->pin_global--; __i915_vma_unpin(vma); + i915_vma_make_shrinkable(vma); } static void __intel_context_retire(struct i915_active *active) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index feaf916c9abd..117ca64149e1 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -230,7 +230,8 @@ int intel_gt_init_scratch(struct intel_gt *gt, unsigned int size) if (ret) goto err_unref; - gt->scratch = vma; + gt->scratch = i915_vma_make_unshrinkable(vma); + return 0; err_unref: diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c index 38ec11ae6ed7..d8efb88f33f3 100644 --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c @@ -1238,7 +1238,7 @@ int intel_ring_pin(struct intel_ring *ring) goto err_ring; } - vma->obj->pin_global++; + i915_vma_make_unshrinkable(vma); GEM_BUG_ON(ring->vaddr); ring->vaddr = addr; @@ -1267,6 +1267,8 @@ void intel_ring_reset(struct intel_ring *ring, u32 tail) void intel_ring_unpin(struct intel_ring *ring) { + struct i915_vma *vma = ring->vma; + if (!atomic_dec_and_test(&ring->pin_count)) return; @@ -1275,18 +1277,17 @@ void intel_ring_unpin(struct intel_ring *ring) /* Discard any unused bytes beyond that submitted to hw. */ intel_ring_reset(ring, ring->tail); - GEM_BUG_ON(!ring->vma); - i915_vma_unset_ggtt_write(ring->vma); - if (i915_vma_is_map_and_fenceable(ring->vma)) - i915_vma_unpin_iomap(ring->vma); + i915_vma_unset_ggtt_write(vma); + if (i915_vma_is_map_and_fenceable(vma)) + i915_vma_unpin_iomap(vma); else - i915_gem_object_unpin_map(ring->vma->obj); + i915_gem_object_unpin_map(vma->obj); GEM_BUG_ON(!ring->vaddr); ring->vaddr = NULL; - ring->vma->obj->pin_global--; - i915_vma_unpin(ring->vma); + i915_vma_unpin(vma); + i915_vma_make_purgeable(vma); intel_timeline_unpin(ring->timeline); } diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c index 13fbbffd05c7..ed64fd9be6a9 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c @@ -625,7 +625,7 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc *guc, u32 size) goto err; } - return vma; + return i915_vma_make_unshrinkable(vma); err: i915_gem_object_put(obj); diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 63d4a7842a23..223373885201 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -367,8 +367,9 @@ static int i915_gem_object_info(struct seq_file *m, void *data) struct drm_i915_private *i915 = node_to_i915(m->private); int ret; - seq_printf(m, "%u shrinkable objects, %llu bytes\n", + seq_printf(m, "%u shrinkable [%u free] objects, %llu bytes\n", i915->mm.shrink_count, + atomic_read(&i915->mm.free_count), i915->mm.shrink_memory); seq_putc(m, '\n'); diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index eb16a1a93bbc..b52f71e0ade6 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -1030,6 +1030,22 @@ int i915_vma_unbind(struct i915_vma *vma) return 0; } +struct i915_vma *i915_vma_make_unshrinkable(struct i915_vma *vma) +{ + i915_gem_object_make_unshrinkable(vma->obj); + return vma; +} + +void i915_vma_make_shrinkable(struct i915_vma *vma) +{ + i915_gem_object_make_shrinkable(vma->obj); +} + +void i915_vma_make_purgeable(struct i915_vma *vma) +{ + i915_gem_object_make_purgeable(vma->obj); +} + #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) #include "selftests/i915_vma.c" #endif diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 4b769db649bf..5c4224749bde 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -459,4 +459,8 @@ void i915_vma_parked(struct drm_i915_private *i915); struct i915_vma *i915_vma_alloc(void); void i915_vma_free(struct i915_vma *vma); +struct i915_vma *i915_vma_make_unshrinkable(struct i915_vma *vma); +void i915_vma_make_shrinkable(struct i915_vma *vma); +void i915_vma_make_purgeable(struct i915_vma *vma); + #endif From patchwork Fri Jul 26 08:45:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060573 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B5DDF13AC for ; Fri, 26 Jul 2019 08:48:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A5135205FC for ; Fri, 26 Jul 2019 08:48:09 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 999AD28A3B; Fri, 26 Jul 2019 08:48:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 918FA205FC for ; Fri, 26 Jul 2019 08:48:08 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9548E6ECBC; Fri, 26 Jul 2019 08:48:06 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2E7886E8B0 for ; Fri, 26 Jul 2019 08:47:59 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621085-1500050 for multiple; Fri, 26 Jul 2019 09:46:16 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:58 +0100 Message-Id: <20190726084613.22129-12-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 12/27] drm/i915/gt: Move the [class][inst] lookup for engines onto the GT X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP To maintain a fast lookup from a GT centric irq handler, we want the engine lookup tables on the intel_gt. To avoid having multiple copies of the same multi-dimension lookup table, move the generic user engine lookup into an rbtree (for fast and flexible indexing). v2: Split uabi_instance cf uabi_class Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Daniele Ceraolo Spurio Reviewed-by: Daniele Ceraolo Spurio --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gem/i915_gem_context.c | 3 +- drivers/gpu/drm/i915/gt/intel_engine.h | 3 - drivers/gpu/drm/i915/gt/intel_engine_cs.c | 53 +++++----------- drivers/gpu/drm/i915/gt/intel_engine_types.h | 9 ++- drivers/gpu/drm/i915/gt/intel_engine_user.c | 66 ++++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_engine_user.h | 20 ++++++ drivers/gpu/drm/i915/gt/intel_gt_types.h | 4 ++ drivers/gpu/drm/i915/gt/selftest_lrc.c | 15 +++-- drivers/gpu/drm/i915/i915_drv.h | 7 ++- drivers/gpu/drm/i915/i915_irq.c | 2 +- drivers/gpu/drm/i915/i915_pmu.c | 3 +- drivers/gpu/drm/i915/i915_query.c | 2 +- drivers/gpu/drm/i915/i915_trace.h | 10 +-- 14 files changed, 138 insertions(+), 60 deletions(-) create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_user.c create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_user.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 18201852d68d..d3fb2e3781af 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -75,6 +75,7 @@ gt-y += \ gt/intel_engine_cs.o \ gt/intel_engine_heartbeat.o \ gt/intel_engine_pm.o \ + gt/intel_engine_user.o \ gt/intel_gt.o \ gt/intel_gt_pm.o \ gt/intel_lrc.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 18b226bc5e3a..e31431fa141e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -70,6 +70,7 @@ #include #include "gt/intel_lrc_reg.h" +#include "gt/intel_engine_user.h" #include "i915_gem_context.h" #include "i915_globals.h" @@ -1740,7 +1741,7 @@ get_engines(struct i915_gem_context *ctx, if (e->engines[n]) { ci.engine_class = e->engines[n]->engine->uabi_class; - ci.engine_instance = e->engines[n]->engine->instance; + ci.engine_instance = e->engines[n]->engine->uabi_instance; } if (copy_to_user(&user->engines[n], &ci, sizeof(ci))) { diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index 38eeb4320c97..f0ad33929987 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -400,9 +400,6 @@ void intel_engine_dump(struct intel_engine_cs *engine, struct drm_printer *m, const char *header, ...); -struct intel_engine_cs * -intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance); - static inline void intel_engine_context_in(struct intel_engine_cs *engine) { unsigned long flags; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 89f4a3825b77..5935702478ca 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -32,6 +32,7 @@ #include "intel_engine.h" #include "intel_engine_pm.h" +#include "intel_engine_user.h" #include "intel_context.h" #include "intel_lrc.h" #include "intel_reset.h" @@ -285,9 +286,7 @@ static void intel_engine_sanitize_mmio(struct intel_engine_cs *engine) intel_engine_set_hwsp_writemask(engine, ~0u); } -static int -intel_engine_setup(struct drm_i915_private *dev_priv, - enum intel_engine_id id) +static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id) { const struct engine_info *info = &intel_engines[id]; struct intel_engine_cs *engine; @@ -303,10 +302,9 @@ intel_engine_setup(struct drm_i915_private *dev_priv, if (GEM_DEBUG_WARN_ON(info->instance > MAX_ENGINE_INSTANCE)) return -EINVAL; - if (GEM_DEBUG_WARN_ON(dev_priv->engine_class[info->class][info->instance])) + if (GEM_DEBUG_WARN_ON(gt->engine_class[info->class][info->instance])) return -EINVAL; - GEM_BUG_ON(dev_priv->engine[id]); engine = kzalloc(sizeof(*engine), GFP_KERNEL); if (!engine) return -ENOMEM; @@ -315,12 +313,12 @@ intel_engine_setup(struct drm_i915_private *dev_priv, engine->id = id; engine->mask = BIT(id); - engine->i915 = dev_priv; - engine->gt = &dev_priv->gt; - engine->uncore = &dev_priv->uncore; + engine->i915 = gt->i915; + engine->gt = gt; + engine->uncore = gt->uncore; __sprint_engine_name(engine->name, info); engine->hw_id = engine->guc_id = info->hw_id; - engine->mmio_base = __engine_mmio_base(dev_priv, info->mmio_bases); + engine->mmio_base = __engine_mmio_base(gt->i915, info->mmio_bases); engine->class = info->class; engine->instance = info->instance; @@ -331,13 +329,14 @@ intel_engine_setup(struct drm_i915_private *dev_priv, engine->destroy = (typeof(engine->destroy))kfree; engine->uabi_class = intel_engine_classes[info->class].uabi_class; + engine->uabi_instance = info->instance; - engine->context_size = intel_engine_context_size(dev_priv, + engine->context_size = intel_engine_context_size(gt->i915, engine->class); if (WARN_ON(engine->context_size > BIT(20))) engine->context_size = 0; if (engine->context_size) - DRIVER_CAPS(dev_priv)->has_logical_contexts = true; + DRIVER_CAPS(gt->i915)->has_logical_contexts = true; /* Nothing to do here, execute in order of dependencies */ engine->schedule = NULL; @@ -349,8 +348,11 @@ intel_engine_setup(struct drm_i915_private *dev_priv, /* Scrub mmio state on takeover */ intel_engine_sanitize_mmio(engine); - dev_priv->engine_class[info->class][info->instance] = engine; - dev_priv->engine[id] = engine; + engine->gt->engine_class[info->class][info->instance] = engine; + + intel_engine_add_user(engine); + gt->i915->engine[id] = engine; + return 0; } @@ -433,7 +435,7 @@ int intel_engines_init_mmio(struct drm_i915_private *i915) if (!HAS_ENGINE(i915, i)) continue; - err = intel_engine_setup(i915, i); + err = intel_engine_setup(&i915->gt, i); if (err) goto cleanup; @@ -1505,29 +1507,6 @@ void intel_engine_dump(struct intel_engine_cs *engine, intel_engine_print_breadcrumbs(engine, m); } -static u8 user_class_map[] = { - [I915_ENGINE_CLASS_RENDER] = RENDER_CLASS, - [I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS, - [I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS, - [I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS, -}; - -struct intel_engine_cs * -intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance) -{ - if (class >= ARRAY_SIZE(user_class_map)) - return NULL; - - class = user_class_map[class]; - - GEM_BUG_ON(class > MAX_ENGINE_CLASS); - - if (instance > MAX_ENGINE_INSTANCE) - return NULL; - - return i915->engine_class[class][instance]; -} - /** * intel_enable_engine_stats() - Enable engine busy tracking on engine * @engine: engine to enable stats collection diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 9a3ead556743..0ef07ada0ae7 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include @@ -260,15 +261,19 @@ struct intel_engine_cs { unsigned int guc_id; intel_engine_mask_t mask; - u8 uabi_class; - u8 class; u8 instance; + + u8 uabi_class; + u8 uabi_instance; + u32 context_size; u32 mmio_base; u32 uabi_capabilities; + struct rb_node uabi_node; + struct intel_sseu sseu; struct intel_ring *buffer; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c new file mode 100644 index 000000000000..f74fb4d2fa0d --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c @@ -0,0 +1,66 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#include "i915_drv.h" +#include "intel_engine.h" +#include "intel_engine_user.h" + +struct intel_engine_cs * +intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance) +{ + struct rb_node *p = i915->uabi_engines.rb_node; + + while (p) { + struct intel_engine_cs *it = + rb_entry(p, typeof(*it), uabi_node); + + if (class < it->uabi_class) + p = p->rb_left; + else if (class > it->uabi_class || + instance > it->uabi_instance) + p = p->rb_right; + else if (instance < it->uabi_instance) + p = p->rb_left; + else + return it; + } + + return NULL; +} + +void intel_engine_add_user(struct intel_engine_cs *engine) +{ + struct rb_root *root = &engine->i915->uabi_engines; + struct rb_node **p, *parent; + + parent = NULL; + p = &root->rb_node; + while (*p) { + struct intel_engine_cs *it; + + parent = *p; + it = rb_entry(parent, typeof(*it), uabi_node); + + /* All user class:instance identifiers must be unique */ + GEM_BUG_ON(it->uabi_class == engine->uabi_class && + it->uabi_instance == engine->uabi_instance); + + if (engine->uabi_class < it->uabi_class) + p = &parent->rb_left; + else if (engine->uabi_class > it->uabi_class || + engine->uabi_instance > it->uabi_instance) + p = &parent->rb_right; + else + p = &parent->rb_left; + } + + rb_link_node(&engine->uabi_node, parent, p); + rb_insert_color(&engine->uabi_node, root); + + GEM_BUG_ON(intel_engine_lookup_user(engine->i915, + engine->uabi_class, + engine->uabi_instance) != engine); +} diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.h b/drivers/gpu/drm/i915/gt/intel_engine_user.h new file mode 100644 index 000000000000..091dc8a4a39f --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.h @@ -0,0 +1,20 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#ifndef INTEL_ENGINE_USER_H +#define INTEL_ENGINE_USER_H + +#include + +struct drm_i915_private; +struct intel_engine_cs; + +struct intel_engine_cs * +intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance); + +void intel_engine_add_user(struct intel_engine_cs *engine); + +#endif /* INTEL_ENGINE_USER_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 0112bb98cbf3..5fa7bf793f0f 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -21,6 +21,7 @@ struct drm_i915_private; struct i915_ggtt; +struct intel_engine_cs; struct intel_uncore; struct intel_gt { @@ -67,6 +68,9 @@ struct intel_gt { u32 pm_ier; u32 pm_guc_events; + + struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1] + [MAX_ENGINE_INSTANCE + 1]; }; enum intel_gt_scratch_field { diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 60f27e52d267..eb40a58665be 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -1773,6 +1773,7 @@ static int live_virtual_engine(void *arg) struct drm_i915_private *i915 = arg; struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1]; struct intel_engine_cs *engine; + struct intel_gt *gt = &i915->gt; enum intel_engine_id id; unsigned int class, inst; int err = -ENODEV; @@ -1796,10 +1797,10 @@ static int live_virtual_engine(void *arg) nsibling = 0; for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) { - if (!i915->engine_class[class][inst]) + if (!gt->engine_class[class][inst]) continue; - siblings[nsibling++] = i915->engine_class[class][inst]; + siblings[nsibling++] = gt->engine_class[class][inst]; } if (nsibling < 2) continue; @@ -1920,6 +1921,7 @@ static int live_virtual_mask(void *arg) { struct drm_i915_private *i915 = arg; struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1]; + struct intel_gt *gt = &i915->gt; unsigned int class, inst; int err = 0; @@ -1933,10 +1935,10 @@ static int live_virtual_mask(void *arg) nsibling = 0; for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) { - if (!i915->engine_class[class][inst]) + if (!gt->engine_class[class][inst]) break; - siblings[nsibling++] = i915->engine_class[class][inst]; + siblings[nsibling++] = gt->engine_class[class][inst]; } if (nsibling < 2) continue; @@ -2097,6 +2099,7 @@ static int live_virtual_bond(void *arg) }; struct drm_i915_private *i915 = arg; struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1]; + struct intel_gt *gt = &i915->gt; unsigned int class, inst; int err = 0; @@ -2111,11 +2114,11 @@ static int live_virtual_bond(void *arg) nsibling = 0; for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) { - if (!i915->engine_class[class][inst]) + if (!gt->engine_class[class][inst]) break; GEM_BUG_ON(nsibling == ARRAY_SIZE(siblings)); - siblings[nsibling++] = i915->engine_class[class][inst]; + siblings[nsibling++] = gt->engine_class[class][inst]; } if (nsibling < 2) continue; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ab548ec3da1c..58b1ef5f2ce0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1371,11 +1371,12 @@ struct drm_i915_private { wait_queue_head_t gmbus_wait_queue; struct pci_dev *bridge_dev; - struct intel_engine_cs *engine[I915_NUM_ENGINES]; + /* Context used internally to idle the GPU and setup initial state */ struct i915_gem_context *kernel_context; - struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1] - [MAX_ENGINE_INSTANCE + 1]; + + struct intel_engine_cs *engine[I915_NUM_ENGINES]; + struct rb_root uabi_engines; struct resource mch_res; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index a17d4fd17962..7b19d7df9ba1 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -3109,7 +3109,7 @@ gen11_engine_irq_handler(struct intel_gt *gt, const u8 class, struct intel_engine_cs *engine; if (instance <= MAX_ENGINE_INSTANCE) - engine = gt->i915->engine_class[class][instance]; + engine = gt->engine_class[class][instance]; else engine = NULL; diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index eff86483bec0..bdf7963a043b 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -8,6 +8,7 @@ #include #include "gt/intel_engine.h" +#include "gt/intel_engine_user.h" #include "i915_drv.h" #include "i915_pmu.h" @@ -926,7 +927,7 @@ create_event_attributes(struct drm_i915_private *i915) i915_iter = add_i915_attr(i915_iter, str, __I915_PMU_ENGINE(engine->uabi_class, - engine->instance, + engine->uabi_instance, engine_events[i].sample)); str = kasprintf(GFP_KERNEL, "%s-%s.unit", diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c index 7b7016171057..70b1ad38e615 100644 --- a/drivers/gpu/drm/i915/i915_query.c +++ b/drivers/gpu/drm/i915/i915_query.c @@ -127,7 +127,7 @@ query_engine_info(struct drm_i915_private *i915, for_each_engine(engine, i915, id) { info.engine.engine_class = engine->uabi_class; - info.engine.engine_instance = engine->instance; + info.engine.engine_instance = engine->uabi_instance; info.capabilities = engine->uabi_capabilities; if (__copy_to_user(info_ptr, &info, sizeof(info))) diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index da18b8d6b80c..1d11245c4c87 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -677,7 +677,7 @@ TRACE_EVENT(i915_request_queue, __entry->dev = rq->i915->drm.primary->index; __entry->hw_id = rq->gem_context->hw_id; __entry->class = rq->engine->uabi_class; - __entry->instance = rq->engine->instance; + __entry->instance = rq->engine->uabi_instance; __entry->ctx = rq->fence.context; __entry->seqno = rq->fence.seqno; __entry->flags = flags; @@ -706,7 +706,7 @@ DECLARE_EVENT_CLASS(i915_request, __entry->dev = rq->i915->drm.primary->index; __entry->hw_id = rq->gem_context->hw_id; __entry->class = rq->engine->uabi_class; - __entry->instance = rq->engine->instance; + __entry->instance = rq->engine->uabi_instance; __entry->ctx = rq->fence.context; __entry->seqno = rq->fence.seqno; ), @@ -751,7 +751,7 @@ TRACE_EVENT(i915_request_in, __entry->dev = rq->i915->drm.primary->index; __entry->hw_id = rq->gem_context->hw_id; __entry->class = rq->engine->uabi_class; - __entry->instance = rq->engine->instance; + __entry->instance = rq->engine->uabi_instance; __entry->ctx = rq->fence.context; __entry->seqno = rq->fence.seqno; __entry->prio = rq->sched.attr.priority; @@ -782,7 +782,7 @@ TRACE_EVENT(i915_request_out, __entry->dev = rq->i915->drm.primary->index; __entry->hw_id = rq->gem_context->hw_id; __entry->class = rq->engine->uabi_class; - __entry->instance = rq->engine->instance; + __entry->instance = rq->engine->uabi_instance; __entry->ctx = rq->fence.context; __entry->seqno = rq->fence.seqno; __entry->completed = i915_request_completed(rq); @@ -847,7 +847,7 @@ TRACE_EVENT(i915_request_wait_begin, __entry->dev = rq->i915->drm.primary->index; __entry->hw_id = rq->gem_context->hw_id; __entry->class = rq->engine->uabi_class; - __entry->instance = rq->engine->instance; + __entry->instance = rq->engine->uabi_instance; __entry->ctx = rq->fence.context; __entry->seqno = rq->fence.seqno; __entry->flags = flags; From patchwork Fri Jul 26 08:45:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060553 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 81F0413B1 for ; Fri, 26 Jul 2019 08:46:51 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 727EF28A52 for ; Fri, 26 Jul 2019 08:46:51 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6679728A5A; Fri, 26 Jul 2019 08:46:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C4DD728A59 for ; Fri, 26 Jul 2019 08:46:50 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C79F46E8B8; Fri, 26 Jul 2019 08:46:48 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 919A56E8AD for ; Fri, 26 Jul 2019 08:46:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621086-1500050 for multiple; Fri, 26 Jul 2019 09:46:16 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:45:59 +0100 Message-Id: <20190726084613.22129-13-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 13/27] drm/i915: Introduce for_each_user_engine() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Now that we have a compact tree representation for uabi engines, make use of it for walking all user engines from catchall user interfaces like debugfs and capabilities. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 6 ++--- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 +- drivers/gpu/drm/i915/i915_cmd_parser.c | 3 +-- drivers/gpu/drm/i915/i915_debugfs.c | 23 +++++++------------ drivers/gpu/drm/i915/i915_drv.h | 8 +++++++ drivers/gpu/drm/i915/i915_query.c | 3 +-- 6 files changed, 21 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 5935702478ca..095ddcbd75bf 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -694,12 +694,11 @@ void intel_engines_set_scheduler_caps(struct drm_i915_private *i915) #undef MAP }; struct intel_engine_cs *engine; - enum intel_engine_id id; u32 enabled, disabled; enabled = 0; disabled = 0; - for_each_engine(engine, i915, id) { /* all engines must agree! */ + for_each_user_engine(engine, i915) { /* all engines must agree! */ int i; if (engine->schedule) @@ -1194,11 +1193,10 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine) unsigned int intel_engines_has_context_isolation(struct drm_i915_private *i915) { struct intel_engine_cs *engine; - enum intel_engine_id id; unsigned int which; which = 0; - for_each_engine(engine, i915, id) + for_each_user_engine(engine, i915) if (engine->default_state) which |= BIT(engine->uabi_class); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index b4238fe16a03..d91e4967217e 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -488,7 +488,7 @@ static void guc_add_request(struct intel_guc *guc, struct i915_request *rq) ring_tail, rq->fence.seqno); guc_ring_doorbell(client); - client->submissions[engine->id] += 1; + client->submissions[engine->guc_id] += 1; } /* diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index a28bcd2d7c09..730c1ed6d2a7 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -1352,11 +1352,10 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv) { struct intel_engine_cs *engine; - enum intel_engine_id id; bool active = false; /* If the command parser is not enabled, report 0 - unsupported */ - for_each_engine(engine, dev_priv, id) { + for_each_user_engine(engine, dev_priv) { if (intel_engine_needs_cmd_parser(engine)) { active = true; break; diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 223373885201..a51a0c8e8a83 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -301,10 +301,9 @@ static void print_batch_pool_stats(struct seq_file *m, struct drm_i915_gem_object *obj; struct intel_engine_cs *engine; struct file_stats stats = {}; - enum intel_engine_id id; int j; - for_each_engine(engine, dev_priv, id) { + for_each_user_engine(engine, dev_priv) { for (j = 0; j < ARRAY_SIZE(engine->batch_pool.cache_list); j++) { list_for_each_entry(obj, &engine->batch_pool.cache_list[j], @@ -391,7 +390,6 @@ static int i915_gem_batch_pool_info(struct seq_file *m, void *data) struct drm_device *dev = &dev_priv->drm; struct drm_i915_gem_object *obj; struct intel_engine_cs *engine; - enum intel_engine_id id; int total = 0; int ret, j; @@ -399,7 +397,7 @@ static int i915_gem_batch_pool_info(struct seq_file *m, void *data) if (ret) return ret; - for_each_engine(engine, dev_priv, id) { + for_each_user_engine(engine, dev_priv) { for (j = 0; j < ARRAY_SIZE(engine->batch_pool.cache_list); j++) { int count; @@ -486,7 +484,6 @@ static int i915_interrupt_info(struct seq_file *m, void *data) { struct drm_i915_private *dev_priv = node_to_i915(m->private); struct intel_engine_cs *engine; - enum intel_engine_id id; intel_wakeref_t wakeref; int i, pipe; @@ -689,7 +686,7 @@ static int i915_interrupt_info(struct seq_file *m, void *data) I915_READ(GEN11_GUNIT_CSME_INTR_MASK)); } else if (INTEL_GEN(dev_priv) >= 6) { - for_each_engine(engine, dev_priv, id) { + for_each_user_engine(engine, dev_priv) { seq_printf(m, "Graphics Interrupt mask (%s): %08x\n", engine->name, ENGINE_READ(engine, RING_IMR)); @@ -1879,7 +1876,6 @@ static void i915_guc_client_info(struct seq_file *m, struct intel_guc_client *client) { struct intel_engine_cs *engine; - enum intel_engine_id id; u64 tot = 0; seq_printf(m, "\tPriority %d, GuC stage index: %u, PD offset 0x%x\n", @@ -1887,8 +1883,8 @@ static void i915_guc_client_info(struct seq_file *m, seq_printf(m, "\tDoorbell id %d, offset: 0x%lx\n", client->doorbell_id, client->doorbell_offset); - for_each_engine(engine, dev_priv, id) { - u64 submissions = client->submissions[id]; + for_each_user_engine(engine, dev_priv) { + u64 submissions = client->submissions[engine->guc_id]; tot += submissions; seq_printf(m, "\tSubmissions: %llu %s\n", submissions, engine->name); @@ -1928,7 +1924,6 @@ static int i915_guc_stage_pool(struct seq_file *m, void *data) struct drm_i915_private *dev_priv = node_to_i915(m->private); const struct intel_guc *guc = &dev_priv->gt.uc.guc; struct guc_stage_desc *desc = guc->stage_desc_pool_vaddr; - intel_engine_mask_t tmp; int index; if (!USES_GUC_SUBMISSION(dev_priv)) @@ -1957,7 +1952,7 @@ static int i915_guc_stage_pool(struct seq_file *m, void *data) desc->wq_addr, desc->wq_size); seq_putc(m, '\n'); - for_each_engine(engine, dev_priv, tmp) { + for_each_user_engine(engine, dev_priv) { u32 guc_engine_id = engine->guc_id; struct guc_execlist_context *lrc = &desc->lrc[guc_engine_id]; @@ -2790,7 +2785,6 @@ static int i915_engine_info(struct seq_file *m, void *unused) struct drm_i915_private *dev_priv = node_to_i915(m->private); struct intel_engine_cs *engine; intel_wakeref_t wakeref; - enum intel_engine_id id; struct drm_printer p; wakeref = intel_runtime_pm_get(&dev_priv->runtime_pm); @@ -2802,7 +2796,7 @@ static int i915_engine_info(struct seq_file *m, void *unused) RUNTIME_INFO(dev_priv)->cs_timestamp_frequency_khz); p = drm_seq_file_printer(m); - for_each_engine(engine, dev_priv, id) + for_each_user_engine(engine, dev_priv) intel_engine_dump(engine, &p, "%s\n", engine->name); intel_runtime_pm_put(&dev_priv->runtime_pm, wakeref); @@ -2883,9 +2877,8 @@ static int i915_wa_registers(struct seq_file *m, void *unused) { struct drm_i915_private *i915 = node_to_i915(m->private); struct intel_engine_cs *engine; - enum intel_engine_id id; - for_each_engine(engine, i915, id) { + for_each_user_engine(engine, i915) { const struct i915_wa_list *wal = &engine->ctx_wa_list; const struct i915_wa *wa; unsigned int count; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 58b1ef5f2ce0..76b90d7be231 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1915,6 +1915,14 @@ static inline struct drm_i915_private *wopcm_to_i915(struct intel_wopcm *wopcm) ((engine__) = (dev_priv__)->engine[__mask_next_bit(tmp__)]), 1 : \ 0;) +#define rb_to_uabi_engine(rb) \ + rb_entry_safe(rb, struct intel_engine_cs, uabi_node) + +#define for_each_user_engine(engine__, i915__) \ + for ((engine__) = rb_to_uabi_engine(rb_first(&(i915__)->uabi_engines));\ + (engine__); \ + (engine__) = rb_to_uabi_engine(rb_next(&(engine__)->uabi_node))) + enum hdmi_force_audio { HDMI_AUDIO_OFF_DVI = -2, /* no aux data for HDMI-DVI converter */ HDMI_AUDIO_OFF, /* force turn off HDMI audio */ diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c index 70b1ad38e615..8abba3a31767 100644 --- a/drivers/gpu/drm/i915/i915_query.c +++ b/drivers/gpu/drm/i915/i915_query.c @@ -105,7 +105,6 @@ query_engine_info(struct drm_i915_private *i915, struct drm_i915_query_engine_info query; struct drm_i915_engine_info info = { }; struct intel_engine_cs *engine; - enum intel_engine_id id; int len, ret; if (query_item->flags) @@ -125,7 +124,7 @@ query_engine_info(struct drm_i915_private *i915, info_ptr = &query_ptr->engines[0]; - for_each_engine(engine, i915, id) { + for_each_user_engine(engine, i915) { info.engine.engine_class = engine->uabi_class; info.engine.engine_instance = engine->uabi_instance; info.capabilities = engine->uabi_capabilities; From patchwork Fri Jul 26 08:46:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060559 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CA80413A0 for ; Fri, 26 Jul 2019 08:46:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BB3DA28894 for ; Fri, 26 Jul 2019 08:46:55 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AFEFA28A3D; Fri, 26 Jul 2019 08:46:55 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 6EDF928894 for ; Fri, 26 Jul 2019 08:46:55 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 92B0D6E8B3; Fri, 26 Jul 2019 08:46:49 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 604CB6E8A9 for ; Fri, 26 Jul 2019 08:46:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621089-1500050 for multiple; Fri, 26 Jul 2019 09:46:16 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:00 +0100 Message-Id: <20190726084613.22129-14-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 14/27] drm/i915: Use intel_engine_lookup_user for probing HAS_BSD etc X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Use the same mechanism to determine if a backend engine exists for a uabi mapping as used internally. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index d960672ae550..036ef4bc4fec 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -61,6 +61,7 @@ #include "gem/i915_gem_context.h" #include "gem/i915_gem_ioctls.h" +#include "gt/intel_engine_user.h" #include "gt/intel_gt.h" #include "gt/intel_gt_pm.h" #include "gt/intel_reset.h" @@ -371,16 +372,20 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data, value = dev_priv->overlay ? 1 : 0; break; case I915_PARAM_HAS_BSD: - value = !!dev_priv->engine[VCS0]; + value = !!intel_engine_lookup_user(dev_priv, + I915_ENGINE_CLASS_VIDEO, 0); break; case I915_PARAM_HAS_BLT: - value = !!dev_priv->engine[BCS0]; + value = !!intel_engine_lookup_user(dev_priv, + I915_ENGINE_CLASS_COPY, 0); break; case I915_PARAM_HAS_VEBOX: - value = !!dev_priv->engine[VECS0]; + value = !!intel_engine_lookup_user(dev_priv, + I915_ENGINE_CLASS_VIDEO_ENHANCE, 0); break; case I915_PARAM_HAS_BSD2: - value = !!dev_priv->engine[VCS1]; + value = !!intel_engine_lookup_user(dev_priv, + I915_ENGINE_CLASS_VIDEO, 1); break; case I915_PARAM_HAS_LLC: value = HAS_LLC(dev_priv); From patchwork Fri Jul 26 08:46:01 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060555 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 71E1413B1 for ; Fri, 26 Jul 2019 08:46:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6224528894 for ; Fri, 26 Jul 2019 08:46:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 56D3128A59; Fri, 26 Jul 2019 08:46:52 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 97B7B28A52 for ; Fri, 26 Jul 2019 08:46:51 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 325026E8BA; Fri, 26 Jul 2019 08:46:49 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id D9C2F6E8AC for ; Fri, 26 Jul 2019 08:46:27 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621090-1500050 for multiple; Fri, 26 Jul 2019 09:46:17 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:01 +0100 Message-Id: <20190726084613.22129-15-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 15/27] drm/i915: Isolate i915_getparam_ioctl() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP This giant switch has tendrils all other the struct and does not fit into the rest of the driver bring up and control of i915_drv.c. Push it to one side so that it can grow in peace. Signed-off-by: Chris Wilson Acked-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/i915_drv.c | 168 --------------------------- drivers/gpu/drm/i915/i915_drv.h | 3 + drivers/gpu/drm/i915/i915_getparam.c | 167 ++++++++++++++++++++++++++ 4 files changed, 171 insertions(+), 168 deletions(-) create mode 100644 drivers/gpu/drm/i915/i915_getparam.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index d3fb2e3781af..4d68529a270e 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -42,6 +42,7 @@ subdir-ccflags-y += -I$(srctree)/$(src) # core driver code i915-y += i915_drv.o \ i915_irq.o \ + i915_getparam.o \ i915_params.o \ i915_pci.o \ i915_scatterlist.o \ diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 036ef4bc4fec..733f9e9d24aa 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -61,22 +61,15 @@ #include "gem/i915_gem_context.h" #include "gem/i915_gem_ioctls.h" -#include "gt/intel_engine_user.h" #include "gt/intel_gt.h" #include "gt/intel_gt_pm.h" -#include "gt/intel_reset.h" -#include "gt/intel_workarounds.h" -#include "gt/uc/intel_uc.h" #include "i915_debugfs.h" #include "i915_drv.h" #include "i915_irq.h" -#include "i915_pmu.h" #include "i915_query.h" -#include "i915_trace.h" #include "i915_vgpu.h" #include "intel_csr.h" -#include "intel_drv.h" #include "intel_pm.h" static struct drm_driver driver; @@ -343,167 +336,6 @@ static void intel_detect_pch(struct drm_i915_private *dev_priv) pci_dev_put(pch); } -static int i915_getparam_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv) -{ - struct drm_i915_private *dev_priv = to_i915(dev); - struct pci_dev *pdev = dev_priv->drm.pdev; - const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu; - drm_i915_getparam_t *param = data; - int value; - - switch (param->param) { - case I915_PARAM_IRQ_ACTIVE: - case I915_PARAM_ALLOW_BATCHBUFFER: - case I915_PARAM_LAST_DISPATCH: - case I915_PARAM_HAS_EXEC_CONSTANTS: - /* Reject all old ums/dri params. */ - return -ENODEV; - case I915_PARAM_CHIPSET_ID: - value = pdev->device; - break; - case I915_PARAM_REVISION: - value = pdev->revision; - break; - case I915_PARAM_NUM_FENCES_AVAIL: - value = dev_priv->ggtt.num_fences; - break; - case I915_PARAM_HAS_OVERLAY: - value = dev_priv->overlay ? 1 : 0; - break; - case I915_PARAM_HAS_BSD: - value = !!intel_engine_lookup_user(dev_priv, - I915_ENGINE_CLASS_VIDEO, 0); - break; - case I915_PARAM_HAS_BLT: - value = !!intel_engine_lookup_user(dev_priv, - I915_ENGINE_CLASS_COPY, 0); - break; - case I915_PARAM_HAS_VEBOX: - value = !!intel_engine_lookup_user(dev_priv, - I915_ENGINE_CLASS_VIDEO_ENHANCE, 0); - break; - case I915_PARAM_HAS_BSD2: - value = !!intel_engine_lookup_user(dev_priv, - I915_ENGINE_CLASS_VIDEO, 1); - break; - case I915_PARAM_HAS_LLC: - value = HAS_LLC(dev_priv); - break; - case I915_PARAM_HAS_WT: - value = HAS_WT(dev_priv); - break; - case I915_PARAM_HAS_ALIASING_PPGTT: - value = INTEL_PPGTT(dev_priv); - break; - case I915_PARAM_HAS_SEMAPHORES: - value = !!(dev_priv->caps.scheduler & I915_SCHEDULER_CAP_SEMAPHORES); - break; - case I915_PARAM_HAS_SECURE_BATCHES: - value = capable(CAP_SYS_ADMIN); - break; - case I915_PARAM_CMD_PARSER_VERSION: - value = i915_cmd_parser_get_version(dev_priv); - break; - case I915_PARAM_SUBSLICE_TOTAL: - value = intel_sseu_subslice_total(sseu); - if (!value) - return -ENODEV; - break; - case I915_PARAM_EU_TOTAL: - value = sseu->eu_total; - if (!value) - return -ENODEV; - break; - case I915_PARAM_HAS_GPU_RESET: - value = intel_has_gpu_reset(dev_priv); - if (value && intel_has_reset_engine(dev_priv)) - value = 2; - break; - case I915_PARAM_HAS_RESOURCE_STREAMER: - value = 0; - break; - case I915_PARAM_HAS_POOLED_EU: - value = HAS_POOLED_EU(dev_priv); - break; - case I915_PARAM_MIN_EU_IN_POOL: - value = sseu->min_eu_in_pool; - break; - case I915_PARAM_HUC_STATUS: - value = intel_huc_check_status(&dev_priv->gt.uc.huc); - if (value < 0) - return value; - break; - case I915_PARAM_MMAP_GTT_VERSION: - /* Though we've started our numbering from 1, and so class all - * earlier versions as 0, in effect their value is undefined as - * the ioctl will report EINVAL for the unknown param! - */ - value = i915_gem_mmap_gtt_version(); - break; - case I915_PARAM_HAS_SCHEDULER: - value = dev_priv->caps.scheduler; - break; - - case I915_PARAM_MMAP_VERSION: - /* Remember to bump this if the version changes! */ - case I915_PARAM_HAS_GEM: - case I915_PARAM_HAS_PAGEFLIPPING: - case I915_PARAM_HAS_EXECBUF2: /* depends on GEM */ - case I915_PARAM_HAS_RELAXED_FENCING: - case I915_PARAM_HAS_COHERENT_RINGS: - case I915_PARAM_HAS_RELAXED_DELTA: - case I915_PARAM_HAS_GEN7_SOL_RESET: - case I915_PARAM_HAS_WAIT_TIMEOUT: - case I915_PARAM_HAS_PRIME_VMAP_FLUSH: - case I915_PARAM_HAS_PINNED_BATCHES: - case I915_PARAM_HAS_EXEC_NO_RELOC: - case I915_PARAM_HAS_EXEC_HANDLE_LUT: - case I915_PARAM_HAS_COHERENT_PHYS_GTT: - case I915_PARAM_HAS_EXEC_SOFTPIN: - case I915_PARAM_HAS_EXEC_ASYNC: - case I915_PARAM_HAS_EXEC_FENCE: - case I915_PARAM_HAS_EXEC_CAPTURE: - case I915_PARAM_HAS_EXEC_BATCH_FIRST: - case I915_PARAM_HAS_EXEC_FENCE_ARRAY: - case I915_PARAM_HAS_EXEC_SUBMIT_FENCE: - /* For the time being all of these are always true; - * if some supported hardware does not have one of these - * features this value needs to be provided from - * INTEL_INFO(), a feature macro, or similar. - */ - value = 1; - break; - case I915_PARAM_HAS_CONTEXT_ISOLATION: - value = intel_engines_has_context_isolation(dev_priv); - break; - case I915_PARAM_SLICE_MASK: - value = sseu->slice_mask; - if (!value) - return -ENODEV; - break; - case I915_PARAM_SUBSLICE_MASK: - value = sseu->subslice_mask[0]; - if (!value) - return -ENODEV; - break; - case I915_PARAM_CS_TIMESTAMP_FREQUENCY: - value = 1000 * RUNTIME_INFO(dev_priv)->cs_timestamp_frequency_khz; - break; - case I915_PARAM_MMAP_GTT_COHERENT: - value = INTEL_INFO(dev_priv)->has_coherent_ggtt; - break; - default: - DRM_DEBUG("Unknown parameter %d\n", param->param); - return -EINVAL; - } - - if (put_user(value, param->value)) - return -EFAULT; - - return 0; -} - static int i915_get_bridge_dev(struct drm_i915_private *dev_priv) { int domain = pci_domain_nr(dev_priv->drm.pdev->bus); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 76b90d7be231..18e110f9df92 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2393,6 +2393,9 @@ static inline bool intel_vgpu_active(struct drm_i915_private *dev_priv) return dev_priv->vgpu.active; } +int i915_getparam_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv); + /* i915_gem.c */ int i915_gem_init_userptr(struct drm_i915_private *dev_priv); void i915_gem_cleanup_userptr(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c new file mode 100644 index 000000000000..e6c351080593 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_getparam.c @@ -0,0 +1,167 @@ +/* + * SPDX-License-Identifier: MIT + */ + +#include "gt/intel_engine_user.h" + +#include "i915_drv.h" + +int i915_getparam_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct drm_i915_private *i915 = to_i915(dev); + const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu; + drm_i915_getparam_t *param = data; + int value; + + switch (param->param) { + case I915_PARAM_IRQ_ACTIVE: + case I915_PARAM_ALLOW_BATCHBUFFER: + case I915_PARAM_LAST_DISPATCH: + case I915_PARAM_HAS_EXEC_CONSTANTS: + /* Reject all old ums/dri params. */ + return -ENODEV; + case I915_PARAM_CHIPSET_ID: + value = i915->drm.pdev->device; + break; + case I915_PARAM_REVISION: + value = i915->drm.pdev->revision; + break; + case I915_PARAM_NUM_FENCES_AVAIL: + value = i915->ggtt.num_fences; + break; + case I915_PARAM_HAS_OVERLAY: + value = !!i915->overlay; + break; + case I915_PARAM_HAS_BSD: + value = !!intel_engine_lookup_user(i915, + I915_ENGINE_CLASS_VIDEO, 0); + break; + case I915_PARAM_HAS_BLT: + value = !!intel_engine_lookup_user(i915, + I915_ENGINE_CLASS_COPY, 0); + break; + case I915_PARAM_HAS_VEBOX: + value = !!intel_engine_lookup_user(i915, + I915_ENGINE_CLASS_VIDEO_ENHANCE, 0); + break; + case I915_PARAM_HAS_BSD2: + value = !!intel_engine_lookup_user(i915, + I915_ENGINE_CLASS_VIDEO, 1); + break; + case I915_PARAM_HAS_LLC: + value = HAS_LLC(i915); + break; + case I915_PARAM_HAS_WT: + value = HAS_WT(i915); + break; + case I915_PARAM_HAS_ALIASING_PPGTT: + value = INTEL_PPGTT(i915); + break; + case I915_PARAM_HAS_SEMAPHORES: + value = !!(i915->caps.scheduler & I915_SCHEDULER_CAP_SEMAPHORES); + break; + case I915_PARAM_HAS_SECURE_BATCHES: + value = capable(CAP_SYS_ADMIN); + break; + case I915_PARAM_CMD_PARSER_VERSION: + value = i915_cmd_parser_get_version(i915); + break; + case I915_PARAM_SUBSLICE_TOTAL: + value = intel_sseu_subslice_total(sseu); + if (!value) + return -ENODEV; + break; + case I915_PARAM_EU_TOTAL: + value = sseu->eu_total; + if (!value) + return -ENODEV; + break; + case I915_PARAM_HAS_GPU_RESET: + value = intel_has_gpu_reset(i915); + if (value && intel_has_reset_engine(i915)) + value = 2; + break; + case I915_PARAM_HAS_RESOURCE_STREAMER: + value = 0; + break; + case I915_PARAM_HAS_POOLED_EU: + value = HAS_POOLED_EU(i915); + break; + case I915_PARAM_MIN_EU_IN_POOL: + value = sseu->min_eu_in_pool; + break; + case I915_PARAM_HUC_STATUS: + value = intel_huc_check_status(&i915->gt.uc.huc); + if (value < 0) + return value; + break; + case I915_PARAM_MMAP_GTT_VERSION: + /* Though we've started our numbering from 1, and so class all + * earlier versions as 0, in effect their value is undefined as + * the ioctl will report EINVAL for the unknown param! + */ + value = i915_gem_mmap_gtt_version(); + break; + case I915_PARAM_HAS_SCHEDULER: + value = i915->caps.scheduler; + break; + + case I915_PARAM_MMAP_VERSION: + /* Remember to bump this if the version changes! */ + case I915_PARAM_HAS_GEM: + case I915_PARAM_HAS_PAGEFLIPPING: + case I915_PARAM_HAS_EXECBUF2: /* depends on GEM */ + case I915_PARAM_HAS_RELAXED_FENCING: + case I915_PARAM_HAS_COHERENT_RINGS: + case I915_PARAM_HAS_RELAXED_DELTA: + case I915_PARAM_HAS_GEN7_SOL_RESET: + case I915_PARAM_HAS_WAIT_TIMEOUT: + case I915_PARAM_HAS_PRIME_VMAP_FLUSH: + case I915_PARAM_HAS_PINNED_BATCHES: + case I915_PARAM_HAS_EXEC_NO_RELOC: + case I915_PARAM_HAS_EXEC_HANDLE_LUT: + case I915_PARAM_HAS_COHERENT_PHYS_GTT: + case I915_PARAM_HAS_EXEC_SOFTPIN: + case I915_PARAM_HAS_EXEC_ASYNC: + case I915_PARAM_HAS_EXEC_FENCE: + case I915_PARAM_HAS_EXEC_CAPTURE: + case I915_PARAM_HAS_EXEC_BATCH_FIRST: + case I915_PARAM_HAS_EXEC_FENCE_ARRAY: + case I915_PARAM_HAS_EXEC_SUBMIT_FENCE: + /* For the time being all of these are always true; + * if some supported hardware does not have one of these + * features this value needs to be provided from + * INTEL_INFO(), a feature macro, or similar. + */ + value = 1; + break; + case I915_PARAM_HAS_CONTEXT_ISOLATION: + value = intel_engines_has_context_isolation(i915); + break; + case I915_PARAM_SLICE_MASK: + value = sseu->slice_mask; + if (!value) + return -ENODEV; + break; + case I915_PARAM_SUBSLICE_MASK: + value = sseu->subslice_mask[0]; + if (!value) + return -ENODEV; + break; + case I915_PARAM_CS_TIMESTAMP_FREQUENCY: + value = 1000 * RUNTIME_INFO(i915)->cs_timestamp_frequency_khz; + break; + case I915_PARAM_MMAP_GTT_COHERENT: + value = INTEL_INFO(i915)->has_coherent_ggtt; + break; + default: + DRM_DEBUG("Unknown parameter %d\n", param->param); + return -EINVAL; + } + + if (put_user(value, param->value)) + return -EFAULT; + + return 0; +} From patchwork Fri Jul 26 08:46:02 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060537 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 827761398 for ; Fri, 26 Jul 2019 08:46:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7371F205FC for ; Fri, 26 Jul 2019 08:46:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6740028A52; Fri, 26 Jul 2019 08:46:36 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 97261205FC for ; Fri, 26 Jul 2019 08:46:35 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 946B36E8AC; Fri, 26 Jul 2019 08:46:32 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3575E6E8A7 for ; Fri, 26 Jul 2019 08:46:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621091-1500050 for multiple; Fri, 26 Jul 2019 09:46:17 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:02 +0100 Message-Id: <20190726084613.22129-16-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 16/27] drm/i915: Only include active engines in the capture state X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Skip printing out idle engines that did not contribute to the GPU hang. As the number of engines gets ever larger, we have increasing noise in the error state where typically there is only one guilty request on one engine that we need to inspect. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gpu_error.c | 234 +++++++++++--------------- drivers/gpu/drm/i915/i915_gpu_error.h | 7 +- 2 files changed, 102 insertions(+), 139 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index eff10ca5a788..178100b0d05c 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -49,27 +49,6 @@ #define ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN) #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN) -static inline const struct intel_engine_cs * -engine_lookup(const struct drm_i915_private *i915, unsigned int id) -{ - if (id >= I915_NUM_ENGINES) - return NULL; - - return i915->engine[id]; -} - -static inline const char * -__engine_name(const struct intel_engine_cs *engine) -{ - return engine ? engine->name : ""; -} - -static const char * -engine_name(const struct drm_i915_private *i915, unsigned int id) -{ - return __engine_name(engine_lookup(i915, id)); -} - static void __sg_set_buf(struct scatterlist *sg, void *addr, unsigned int len, loff_t it) { @@ -447,7 +426,7 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m, err_printf(m, " INSTDONE: 0x%08x\n", ee->instdone.instdone); - if (ee->engine_id != RCS0 || INTEL_GEN(m->i915) <= 3) + if (ee->engine->class != RENDER_CLASS || INTEL_GEN(m->i915) <= 3) return; err_printf(m, " SC_INSTDONE: 0x%08x\n", @@ -501,8 +480,7 @@ static void error_print_engine(struct drm_i915_error_state_buf *m, { int n; - err_printf(m, "%s command stream:\n", - engine_name(m->i915, ee->engine_id)); + err_printf(m, "%s command stream:\n", ee->engine->name); err_printf(m, " IDLE?: %s\n", yesno(ee->idle)); err_printf(m, " START: 0x%08x\n", ee->start); err_printf(m, " HEAD: 0x%08x [0x%08x]\n", ee->head, ee->rq_head); @@ -574,9 +552,9 @@ void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...) } static void print_error_obj(struct drm_i915_error_state_buf *m, - struct intel_engine_cs *engine, + const struct intel_engine_cs *engine, const char *name, - struct drm_i915_error_object *obj) + const struct drm_i915_error_object *obj) { char out[ASCII85_BUFSZ]; int page; @@ -673,7 +651,7 @@ static void err_free_sgl(struct scatterlist *sgl) static void __err_print_to_sgl(struct drm_i915_error_state_buf *m, struct i915_gpu_state *error) { - struct drm_i915_error_object *obj; + const struct drm_i915_error_engine *ee; struct timespec64 ts; int i, j; @@ -694,15 +672,12 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m, err_printf(m, "Capture: %lu jiffies; %d ms ago\n", error->capture, jiffies_to_msecs(jiffies - error->capture)); - for (i = 0; i < ARRAY_SIZE(error->engine); i++) { - if (!error->engine[i].context.pid) - continue; - + for (ee = error->engine; ee; ee = ee->next) err_printf(m, "Active process (on ring %s): %s [%d]\n", - engine_name(m->i915, i), - error->engine[i].context.comm, - error->engine[i].context.pid); - } + ee->engine->name, + ee->context.comm, + ee->context.pid); + err_printf(m, "Reset count: %u\n", error->reset_count); err_printf(m, "Suspend count: %u\n", error->suspend_count); err_printf(m, "Platform: %s\n", intel_platform_name(error->device_info.platform)); @@ -751,19 +726,15 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m, if (IS_GEN(m->i915, 7)) err_printf(m, "ERR_INT: 0x%08x\n", error->err_int); - for (i = 0; i < ARRAY_SIZE(error->engine); i++) { - if (error->engine[i].engine_id != -1) - error_print_engine(m, - &error->engine[i], - error->capture); - } + for (ee = error->engine; ee; ee = ee->next) + error_print_engine(m, ee, error->capture); - for (i = 0; i < ARRAY_SIZE(error->engine); i++) { - const struct drm_i915_error_engine *ee = &error->engine[i]; + for (ee = error->engine; ee; ee = ee->next) { + const struct drm_i915_error_object *obj; obj = ee->batchbuffer; if (obj) { - err_puts(m, m->i915->engine[i]->name); + err_puts(m, ee->engine->name); if (ee->context.pid) err_printf(m, " (submitted by %s [%d])", ee->context.comm, @@ -771,16 +742,15 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m, err_printf(m, " --- gtt_offset = 0x%08x %08x\n", upper_32_bits(obj->gtt_offset), lower_32_bits(obj->gtt_offset)); - print_error_obj(m, m->i915->engine[i], NULL, obj); + print_error_obj(m, ee->engine, NULL, obj); } for (j = 0; j < ee->user_bo_count; j++) - print_error_obj(m, m->i915->engine[i], - "user", ee->user_bo[j]); + print_error_obj(m, ee->engine, "user", ee->user_bo[j]); if (ee->num_requests) { err_printf(m, "%s --- %d requests\n", - m->i915->engine[i]->name, + ee->engine->name, ee->num_requests); for (j = 0; j < ee->num_requests; j++) error_print_request(m, " ", @@ -788,22 +758,13 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m, error->capture); } - print_error_obj(m, m->i915->engine[i], - "ringbuffer", ee->ringbuffer); - - print_error_obj(m, m->i915->engine[i], - "HW Status", ee->hws_page); - - print_error_obj(m, m->i915->engine[i], - "HW context", ee->ctx); - - print_error_obj(m, m->i915->engine[i], - "WA context", ee->wa_ctx); - - print_error_obj(m, m->i915->engine[i], + print_error_obj(m, ee->engine, "ringbuffer", ee->ringbuffer); + print_error_obj(m, ee->engine, "HW Status", ee->hws_page); + print_error_obj(m, ee->engine, "HW context", ee->ctx); + print_error_obj(m, ee->engine, "WA context", ee->wa_ctx); + print_error_obj(m, ee->engine, "WA batchbuffer", ee->wa_batchbuffer); - - print_error_obj(m, m->i915->engine[i], + print_error_obj(m, ee->engine, "NULL context", ee->default_state); } @@ -952,13 +913,15 @@ void __i915_gpu_state_free(struct kref *error_ref) { struct i915_gpu_state *error = container_of(error_ref, typeof(*error), ref); - long i, j; + long i; - for (i = 0; i < ARRAY_SIZE(error->engine); i++) { - struct drm_i915_error_engine *ee = &error->engine[i]; + while (error->engine) { + struct drm_i915_error_engine *ee = error->engine; - for (j = 0; j < ee->user_bo_count; j++) - i915_error_object_free(ee->user_bo[j]); + error->engine = ee->next; + + for (i = 0; i < ee->user_bo_count; i++) + i915_error_object_free(ee->user_bo[i]); kfree(ee->user_bo); i915_error_object_free(ee->batchbuffer); @@ -969,6 +932,7 @@ void __i915_gpu_state_free(struct kref *error_ref) i915_error_object_free(ee->wa_ctx); kfree(ee->requests); + kfree(ee); } kfree(error->overlay); @@ -1050,23 +1014,17 @@ i915_error_object_create(struct drm_i915_private *i915, * * It's only a small step better than a random number in its current form. */ -static u32 i915_error_generate_code(struct i915_gpu_state *error, - intel_engine_mask_t engine_mask) +static u32 i915_error_generate_code(struct i915_gpu_state *error) { + const struct drm_i915_error_engine *ee = error->engine; + /* * IPEHR would be an ideal way to detect errors, as it's the gross * measure of "the command that hung." However, has some very common * synchronization commands which almost always appear in the case * strictly a client bug. Use instdone to differentiate those some. */ - if (engine_mask) { - struct drm_i915_error_engine *ee = - &error->engine[ffs(engine_mask)]; - - return ee->ipehr ^ ee->instdone.instdone; - } - - return 0; + return ee ? ee->ipehr ^ ee->instdone.instdone : 0; } static void gem_record_fences(struct i915_gpu_state *error) @@ -1274,9 +1232,11 @@ static void error_record_engine_execlists(const struct intel_engine_cs *engine, ee->num_ports = n; } -static void record_context(struct drm_i915_error_context *e, - struct i915_gem_context *ctx) +static bool record_context(struct drm_i915_error_context *e, + const struct i915_request *rq) { + const struct i915_gem_context *ctx = rq->gem_context; + if (ctx->pid) { struct task_struct *task; @@ -1293,6 +1253,8 @@ static void record_context(struct drm_i915_error_context *e, e->sched_attr = ctx->sched; e->guilty = atomic_read(&ctx->guilty_count); e->active = atomic_read(&ctx->active_count); + + return i915_gem_context_no_error_capture(ctx); } struct capture_vma { @@ -1387,74 +1349,67 @@ static void gem_record_rings(struct i915_gpu_state *error, struct compress *compress) { struct drm_i915_private *i915 = error->i915; - int i; + struct intel_engine_cs *engine; + struct drm_i915_error_engine *ee; + + ee = kzalloc(sizeof(*ee), GFP_KERNEL); + if (!ee) + return; - for (i = 0; i < I915_NUM_ENGINES; i++) { - struct intel_engine_cs *engine = i915->engine[i]; - struct drm_i915_error_engine *ee = &error->engine[i]; + for_each_user_engine(engine, i915) { struct capture_vma *capture = NULL; struct i915_request *request; unsigned long flags; - ee->engine_id = -1; - - if (!engine) - continue; - - ee->engine_id = i; - /* Refill our page pool before entering atomic section */ pool_refill(&compress->pool, ALLOW_FAIL); - error_record_engine_registers(error, engine, ee); - error_record_engine_execlists(engine, ee); - spin_lock_irqsave(&engine->active.lock, flags); request = intel_engine_find_active_request(engine); - if (request) { - struct i915_gem_context *ctx = request->gem_context; - struct intel_ring *ring = request->ring; - - record_context(&ee->context, ctx); - - /* - * We need to copy these to an anonymous buffer - * as the simplest method to avoid being overwritten - * by userspace. - */ - capture = capture_vma(capture, - request->batch, - &ee->batchbuffer); + if (!request) { + spin_unlock_irqrestore(&engine->active.lock, flags); + continue; + } - if (HAS_BROKEN_CS_TLB(i915)) - capture = capture_vma(capture, - engine->gt->scratch, - &ee->wa_batchbuffer); + error->simulated |= record_context(&ee->context, request); - capture = request_record_user_bo(request, ee, capture); + /* + * We need to copy these to an anonymous buffer + * as the simplest method to avoid being overwritten + * by userspace. + */ + capture = capture_vma(capture, + request->batch, + &ee->batchbuffer); + if (HAS_BROKEN_CS_TLB(i915)) capture = capture_vma(capture, - request->hw_context->state, - &ee->ctx); + engine->gt->scratch, + &ee->wa_batchbuffer); - capture = capture_vma(capture, - ring->vma, - &ee->ringbuffer); + capture = request_record_user_bo(request, ee, capture); - error->simulated |= - i915_gem_context_no_error_capture(ctx); + capture = capture_vma(capture, + request->hw_context->state, + &ee->ctx); - ee->rq_head = request->head; - ee->rq_post = request->postfix; - ee->rq_tail = request->tail; + capture = capture_vma(capture, + request->ring->vma, + &ee->ringbuffer); - ee->cpu_ring_head = ring->head; - ee->cpu_ring_tail = ring->tail; + ee->cpu_ring_head = request->ring->head; + ee->cpu_ring_tail = request->ring->tail; - engine_record_requests(engine, request, ee); - } + ee->rq_head = request->head; + ee->rq_post = request->postfix; + ee->rq_tail = request->tail; + + engine_record_requests(engine, request, ee); spin_unlock_irqrestore(&engine->active.lock, flags); + error_record_engine_registers(error, engine, ee); + error_record_engine_execlists(engine, ee); + while (capture) { struct capture_vma *this = capture; struct i915_vma *vma = *this->slot; @@ -1481,7 +1436,18 @@ gem_record_rings(struct i915_gpu_state *error, struct compress *compress) ee->default_state = capture_object(i915, engine->default_state, compress); + + ee->engine = engine; + + ee->next = error->engine; + error->engine = ee; + + ee = kzalloc(sizeof(*ee), GFP_KERNEL); + if (!ee) + return; } + + kfree(ee); } static void @@ -1610,24 +1576,18 @@ error_msg(struct i915_gpu_state *error, intel_engine_mask_t engines, const char *msg) { int len; - int i; - - for (i = 0; i < ARRAY_SIZE(error->engine); i++) - if (!error->engine[i].context.pid) - engines &= ~BIT(i); len = scnprintf(error->error_msg, sizeof(error->error_msg), "GPU HANG: ecode %d:%x:0x%08x", INTEL_GEN(error->i915), engines, - i915_error_generate_code(error, engines)); - if (engines) { + i915_error_generate_code(error)); + if (error->engine) { /* Just show the first executing process, more is confusing */ - i = __ffs(engines); len += scnprintf(error->error_msg + len, sizeof(error->error_msg) - len, ", in %s [%d]", - error->engine[i].context.comm, - error->engine[i].context.pid); + error->engine->context.comm, + error->engine->context.pid); } if (msg) len += scnprintf(error->error_msg + len, diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h index c4ba297ca862..0ed061ee3378 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.h +++ b/drivers/gpu/drm/i915/i915_gpu_error.h @@ -80,7 +80,8 @@ struct i915_gpu_state { struct intel_display_error_state *display; struct drm_i915_error_engine { - int engine_id; + const struct intel_engine_cs *engine; + /* Software tracked state */ bool idle; int num_requests; @@ -156,7 +157,9 @@ struct i915_gpu_state { u32 pp_dir_base; }; } vm_info; - } engine[I915_NUM_ENGINES]; + + struct drm_i915_error_engine *next; + } *engine; struct scatterlist *sgl, *fit; }; From patchwork Fri Jul 26 08:46:03 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060593 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2C86D746 for ; Fri, 26 Jul 2019 09:04:06 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1D93528A60 for ; Fri, 26 Jul 2019 09:04:06 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 11C4828A6A; Fri, 26 Jul 2019 09:04:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id B42D028A60 for ; Fri, 26 Jul 2019 09:04:05 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id ECE4E6E8C5; Fri, 26 Jul 2019 09:04:04 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8CF006E8C5 for ; Fri, 26 Jul 2019 09:04:03 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621092-1500050 for multiple; Fri, 26 Jul 2019 09:46:17 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:03 +0100 Message-Id: <20190726084613.22129-17-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 17/27] drm/i915: Teach execbuffer to take the engine wakeref not GT X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP In the next patch, we would like to couple into the engine wakeref to free the batch pool on idling. The caveat here is that we therefore want to track the engine wakeref more precisely and to hold it instead of the broader GT wakeref as we process the ioctl. v2: Avoid introducing odd semantics for a shortlived timeline->mutex acquisition interface. Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 42 +++++++++++++------ 1 file changed, 29 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 4db4463089ce..8d90498eaf46 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2139,14 +2139,40 @@ static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce) if (err) return err; + /* + * Take a local wakeref for preparing to dispatch the execbuf as + * we expect to access the hardware fairly frequently in the + * process, and require the engine to be kept awake between accesses. + * Upon dispatch, we acquire another prolonged wakeref that we hold + * until the timeline is idle, which in turn releases the wakeref + * taken on the engine, and the parent device. + */ + err = intel_context_timeline_lock(ce); + if (err) + goto err_unpin; + + intel_context_enter(ce); + intel_context_timeline_unlock(ce); + eb->engine = ce->engine; eb->context = ce; return 0; + +err_unpin: + intel_context_unpin(ce); + return err; } static void eb_unpin_context(struct i915_execbuffer *eb) { - intel_context_unpin(eb->context); + struct intel_context *ce = eb->context; + struct intel_timeline *tl = ce->ring->timeline; + + mutex_lock(&tl->mutex); + intel_context_exit(ce); + mutex_unlock(&tl->mutex); + + intel_context_unpin(ce); } static unsigned int @@ -2426,18 +2452,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, if (unlikely(err)) goto err_destroy; - /* - * Take a local wakeref for preparing to dispatch the execbuf as - * we expect to access the hardware fairly frequently in the - * process. Upon first dispatch, we acquire another prolonged - * wakeref that we hold until the GPU has been idle for at least - * 100ms. - */ - intel_gt_pm_get(&eb.i915->gt); - err = i915_mutex_lock_interruptible(dev); if (err) - goto err_rpm; + goto err_context; err = eb_select_engine(&eb, file, args); if (unlikely(err)) @@ -2602,8 +2619,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, eb_unpin_context(&eb); err_unlock: mutex_unlock(&dev->struct_mutex); -err_rpm: - intel_gt_pm_put(&eb.i915->gt); +err_context: i915_gem_context_put(eb.gem_context); err_destroy: eb_destroy(&eb); From patchwork Fri Jul 26 08:46:04 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060543 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 11FBC13A0 for ; Fri, 26 Jul 2019 08:46:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0235E205FC for ; Fri, 26 Jul 2019 08:46:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EA11A28A3B; Fri, 26 Jul 2019 08:46:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 4D985205FC for ; Fri, 26 Jul 2019 08:46:37 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 24DDE6E8B1; Fri, 26 Jul 2019 08:46:33 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id B364F6E8AC for ; Fri, 26 Jul 2019 08:46:28 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621093-1500050 for multiple; Fri, 26 Jul 2019 09:46:17 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:04 +0100 Message-Id: <20190726084613.22129-18-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 18/27] drm/i915/gt: Track timeline activeness in enter/exit X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Lift moving the timeline to/from the active_list on enter/exit in order to shorten the active tracking span in comparison to the existing pin/unpin. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_pm.c | 1 - drivers/gpu/drm/i915/gt/intel_context.c | 2 + drivers/gpu/drm/i915/gt/intel_engine_pm.c | 1 + drivers/gpu/drm/i915/gt/intel_lrc.c | 4 + drivers/gpu/drm/i915/gt/intel_timeline.c | 98 +++++++------------ drivers/gpu/drm/i915/gt/intel_timeline.h | 3 +- .../gpu/drm/i915/gt/intel_timeline_types.h | 18 ++++ drivers/gpu/drm/i915/gt/selftest_timeline.c | 2 - 8 files changed, 63 insertions(+), 66 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c index a5a0aefcc04c..392a1e4fdd76 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c @@ -39,7 +39,6 @@ static void i915_gem_park(struct drm_i915_private *i915) i915_gem_batch_pool_fini(&engine->batch_pool); } - intel_timelines_park(i915); i915_vma_parked(i915); i915_globals_park(); diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index eb46b7e3c1d0..2bae9bcffc1c 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -261,10 +261,12 @@ int __init i915_global_context_init(void) void intel_context_enter_engine(struct intel_context *ce) { intel_engine_pm_get(ce->engine); + intel_timeline_enter(ce->ring->timeline); } void intel_context_exit_engine(struct intel_context *ce) { + intel_timeline_exit(ce->ring->timeline); intel_engine_pm_put(ce->engine); } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 6349bb038c72..7bbc25631979 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -90,6 +90,7 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) /* Check again on the next retirement. */ engine->wakeref_serial = engine->serial + 1; + intel_timeline_enter(rq->timeline); i915_request_add_barriers(rq); __i915_request_commit(rq); diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index c899ce34d0d3..88e0ad93b9a9 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -3309,6 +3309,8 @@ static void virtual_context_enter(struct intel_context *ce) for (n = 0; n < ve->num_siblings; n++) intel_engine_pm_get(ve->siblings[n]); + + intel_timeline_enter(ce->ring->timeline); } static void virtual_context_exit(struct intel_context *ce) @@ -3316,6 +3318,8 @@ static void virtual_context_exit(struct intel_context *ce) struct virtual_engine *ve = container_of(ce, typeof(*ve), context); unsigned int n; + intel_timeline_exit(ce->ring->timeline); + for (n = 0; n < ve->num_siblings; n++) intel_engine_pm_put(ve->siblings[n]); } diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 6daa9eb59e19..4af0b9801d91 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -278,64 +278,11 @@ void intel_timelines_init(struct drm_i915_private *i915) timelines_init(&i915->gt); } -static void timeline_add_to_active(struct intel_timeline *tl) -{ - struct intel_gt_timelines *gt = &tl->gt->timelines; - - mutex_lock(>->mutex); - list_add(&tl->link, >->active_list); - mutex_unlock(>->mutex); -} - -static void timeline_remove_from_active(struct intel_timeline *tl) -{ - struct intel_gt_timelines *gt = &tl->gt->timelines; - - mutex_lock(>->mutex); - list_del(&tl->link); - mutex_unlock(>->mutex); -} - -static void timelines_park(struct intel_gt *gt) -{ - struct intel_gt_timelines *timelines = >->timelines; - struct intel_timeline *timeline; - - mutex_lock(&timelines->mutex); - list_for_each_entry(timeline, &timelines->active_list, link) { - /* - * All known fences are completed so we can scrap - * the current sync point tracking and start afresh, - * any attempt to wait upon a previous sync point - * will be skipped as the fence was signaled. - */ - i915_syncmap_free(&timeline->sync); - } - mutex_unlock(&timelines->mutex); -} - -/** - * intel_timelines_park - called when the driver idles - * @i915: the drm_i915_private device - * - * When the driver is completely idle, we know that all of our sync points - * have been signaled and our tracking is then entirely redundant. Any request - * to wait upon an older sync point will be completed instantly as we know - * the fence is signaled and therefore we will not even look them up in the - * sync point map. - */ -void intel_timelines_park(struct drm_i915_private *i915) -{ - timelines_park(&i915->gt); -} - void intel_timeline_fini(struct intel_timeline *timeline) { GEM_BUG_ON(timeline->pin_count); GEM_BUG_ON(!list_empty(&timeline->requests)); - i915_syncmap_free(&timeline->sync); - if (timeline->hwsp_cacheline) cacheline_free(timeline->hwsp_cacheline); else @@ -370,6 +317,7 @@ int intel_timeline_pin(struct intel_timeline *tl) if (tl->pin_count++) return 0; GEM_BUG_ON(!tl->pin_count); + GEM_BUG_ON(tl->active_count); err = i915_vma_pin(tl->hwsp_ggtt, 0, 0, PIN_GLOBAL | PIN_HIGH); if (err) @@ -380,7 +328,6 @@ int intel_timeline_pin(struct intel_timeline *tl) offset_in_page(tl->hwsp_offset); cacheline_acquire(tl->hwsp_cacheline); - timeline_add_to_active(tl); return 0; @@ -389,6 +336,40 @@ int intel_timeline_pin(struct intel_timeline *tl) return err; } +void intel_timeline_enter(struct intel_timeline *tl) +{ + struct intel_gt_timelines *timelines = &tl->gt->timelines; + + GEM_BUG_ON(!tl->pin_count); + if (tl->active_count++) + return; + GEM_BUG_ON(!tl->active_count); /* overflow? */ + + mutex_lock(&timelines->mutex); + list_add(&tl->link, &timelines->active_list); + mutex_unlock(&timelines->mutex); +} + +void intel_timeline_exit(struct intel_timeline *tl) +{ + struct intel_gt_timelines *timelines = &tl->gt->timelines; + + GEM_BUG_ON(!tl->active_count); + if (--tl->active_count) + return; + + mutex_lock(&timelines->mutex); + list_del(&tl->link); + mutex_unlock(&timelines->mutex); + + /* + * Since this timeline is idle, all bariers upon which we were waiting + * must also be complete and so we can discard the last used barriers + * without loss of information. + */ + i915_syncmap_free(&tl->sync); +} + static u32 timeline_advance(struct intel_timeline *tl) { GEM_BUG_ON(!tl->pin_count); @@ -546,16 +527,9 @@ void intel_timeline_unpin(struct intel_timeline *tl) if (--tl->pin_count) return; - timeline_remove_from_active(tl); + GEM_BUG_ON(tl->active_count); cacheline_release(tl->hwsp_cacheline); - /* - * Since this timeline is idle, all bariers upon which we were waiting - * must also be complete and so we can discard the last used barriers - * without loss of information. - */ - i915_syncmap_free(&tl->sync); - __i915_vma_unpin(tl->hwsp_ggtt); } diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.h b/drivers/gpu/drm/i915/gt/intel_timeline.h index e08cebf64833..f583af1ba18d 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.h +++ b/drivers/gpu/drm/i915/gt/intel_timeline.h @@ -77,9 +77,11 @@ static inline bool intel_timeline_sync_is_later(struct intel_timeline *tl, } int intel_timeline_pin(struct intel_timeline *tl); +void intel_timeline_enter(struct intel_timeline *tl); int intel_timeline_get_seqno(struct intel_timeline *tl, struct i915_request *rq, u32 *seqno); +void intel_timeline_exit(struct intel_timeline *tl); void intel_timeline_unpin(struct intel_timeline *tl); int intel_timeline_read_hwsp(struct i915_request *from, @@ -87,7 +89,6 @@ int intel_timeline_read_hwsp(struct i915_request *from, u32 *hwsp_offset); void intel_timelines_init(struct drm_i915_private *i915); -void intel_timelines_park(struct drm_i915_private *i915); void intel_timelines_fini(struct drm_i915_private *i915); #endif diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h index 9a71aea7a338..b1a9f0c54bf0 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h +++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h @@ -25,7 +25,25 @@ struct intel_timeline { struct mutex mutex; /* protects the flow of requests */ + /* + * pin_count and active_count track essentially the same thing: + * How many requests are in flight or may be under construction. + * + * We need two distinct counters so that we can assign different + * lifetimes to the events for different use-cases. For example, + * we want to permanently keep the timeline pinned for the kernel + * context so that we can issue requests at any time without having + * to acquire space in the GGTT. However, we want to keep tracking + * the activity (to be able to detect when we become idle) along that + * permanently pinned timeline and so end up requiring two counters. + * + * Note that the active_count is protected by the intel_timeline.mutex, + * but the pin_count is protected by a combination of serialisation + * from the intel_context caller plus internal atomicity. + */ unsigned int pin_count; + unsigned int active_count; + const u32 *hwsp_seqno; struct i915_vma *hwsp_ggtt; u32 hwsp_offset; diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c index f0a840030382..d54113697745 100644 --- a/drivers/gpu/drm/i915/gt/selftest_timeline.c +++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c @@ -816,8 +816,6 @@ static int live_hwsp_recycle(void *arg) if (err) goto out; - - intel_timelines_park(i915); /* Encourage recycling! */ } while (!__igt_timeout(end_time, NULL)); } From patchwork Fri Jul 26 08:46:05 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060545 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9A9CA1398 for ; Fri, 26 Jul 2019 08:46:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8B683205FC for ; Fri, 26 Jul 2019 08:46:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8002428A3B; Fri, 26 Jul 2019 08:46:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 17FE7205FC for ; Fri, 26 Jul 2019 08:46:38 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BDAEA6E8AE; Fri, 26 Jul 2019 08:46:32 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 86A266E8A9 for ; Fri, 26 Jul 2019 08:46:28 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621094-1500050 for multiple; Fri, 26 Jul 2019 09:46:17 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:05 +0100 Message-Id: <20190726084613.22129-19-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 19/27] drm/i915/gt: Convert timeline tracking to spinlock X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Convert the list manipulation of active to use spinlocks so that we can perform the updates from underneath a quick interrupt callback. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_gt_types.h | 2 +- drivers/gpu/drm/i915/gt/intel_reset.c | 10 ++++++++-- drivers/gpu/drm/i915/gt/intel_timeline.c | 12 +++++------- drivers/gpu/drm/i915/i915_gem.c | 20 ++++++++++---------- 4 files changed, 24 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 5fa7bf793f0f..b097d83fe112 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -32,7 +32,7 @@ struct intel_gt { struct intel_uc uc; struct intel_gt_timelines { - struct mutex mutex; /* protects list */ + spinlock_t lock; /* protects active_list */ struct list_head active_list; /* Pack multiple timelines' seqnos into the same page */ diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index c937fa80cc3e..bbd1ecac09c2 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -810,7 +810,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt) * * No more can be submitted until we reset the wedged bit. */ - mutex_lock(&timelines->mutex); + spin_lock(&timelines->lock); list_for_each_entry(tl, &timelines->active_list, link) { struct i915_request *rq; @@ -818,6 +818,8 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt) if (!rq) continue; + spin_unlock(&timelines->lock); + /* * All internal dependencies (i915_requests) will have * been flushed by the set-wedge, but we may be stuck waiting @@ -827,8 +829,12 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt) */ dma_fence_default_wait(&rq->fence, false, MAX_SCHEDULE_TIMEOUT); i915_request_put(rq); + + /* Restart iteration after droping lock */ + spin_lock(&timelines->lock); + tl = list_entry(&timelines->active_list, typeof(*tl), link); } - mutex_unlock(&timelines->mutex); + spin_unlock(&timelines->lock); intel_gt_sanitize(gt, false); diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 4af0b9801d91..355dfc52c804 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -266,7 +266,7 @@ static void timelines_init(struct intel_gt *gt) { struct intel_gt_timelines *timelines = >->timelines; - mutex_init(&timelines->mutex); + spin_lock_init(&timelines->lock); INIT_LIST_HEAD(&timelines->active_list); spin_lock_init(&timelines->hwsp_lock); @@ -345,9 +345,9 @@ void intel_timeline_enter(struct intel_timeline *tl) return; GEM_BUG_ON(!tl->active_count); /* overflow? */ - mutex_lock(&timelines->mutex); + spin_lock(&timelines->lock); list_add(&tl->link, &timelines->active_list); - mutex_unlock(&timelines->mutex); + spin_unlock(&timelines->lock); } void intel_timeline_exit(struct intel_timeline *tl) @@ -358,9 +358,9 @@ void intel_timeline_exit(struct intel_timeline *tl) if (--tl->active_count) return; - mutex_lock(&timelines->mutex); + spin_lock(&timelines->lock); list_del(&tl->link); - mutex_unlock(&timelines->mutex); + spin_unlock(&timelines->lock); /* * Since this timeline is idle, all bariers upon which we were waiting @@ -548,8 +548,6 @@ static void timelines_fini(struct intel_gt *gt) GEM_BUG_ON(!list_empty(&timelines->active_list)); GEM_BUG_ON(!list_empty(&timelines->hwsp_free_list)); - - mutex_destroy(&timelines->mutex); } void intel_timelines_fini(struct drm_i915_private *i915) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index a84052f247f2..c7650f15d36b 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -907,20 +907,20 @@ static int wait_for_engines(struct intel_gt *gt) static long wait_for_timelines(struct drm_i915_private *i915, - unsigned int flags, long timeout) + unsigned int wait, long timeout) { - struct intel_gt_timelines *gt = &i915->gt.timelines; + struct intel_gt_timelines *timelines = &i915->gt.timelines; struct intel_timeline *tl; - mutex_lock(>->mutex); - list_for_each_entry(tl, >->active_list, link) { + spin_lock(&timelines->lock); + list_for_each_entry(tl, &timelines->active_list, link) { struct i915_request *rq; rq = i915_active_request_get_unlocked(&tl->last_request); if (!rq) continue; - mutex_unlock(>->mutex); + spin_unlock(&timelines->lock); /* * "Race-to-idle". @@ -931,19 +931,19 @@ wait_for_timelines(struct drm_i915_private *i915, * want to complete as quickly as possible to avoid prolonged * stalls, so allow the gpu to boost to maximum clocks. */ - if (flags & I915_WAIT_FOR_IDLE_BOOST) + if (wait & I915_WAIT_FOR_IDLE_BOOST) gen6_rps_boost(rq); - timeout = i915_request_wait(rq, flags, timeout); + timeout = i915_request_wait(rq, wait, timeout); i915_request_put(rq); if (timeout < 0) return timeout; /* restart after reacquiring the lock */ - mutex_lock(>->mutex); - tl = list_entry(>->active_list, typeof(*tl), link); + spin_lock(&timelines->lock); + tl = list_entry(&timelines->active_list, typeof(*tl), link); } - mutex_unlock(>->mutex); + spin_unlock(&timelines->lock); return timeout; } From patchwork Fri Jul 26 08:46:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060563 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8351214E5 for ; Fri, 26 Jul 2019 08:46:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 73C26205FC for ; Fri, 26 Jul 2019 08:46:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6815328A3B; Fri, 26 Jul 2019 08:46:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 0FE6D205FC for ; Fri, 26 Jul 2019 08:46:57 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5D6076ECBF; Fri, 26 Jul 2019 08:46:52 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 569516E8A7 for ; Fri, 26 Jul 2019 08:46:28 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621095-1500050 for multiple; Fri, 26 Jul 2019 09:46:17 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:06 +0100 Message-Id: <20190726084613.22129-20-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 20/27] drm/i915/gt: Guard timeline pinning with its own mutex X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP In preparation for removing struct_mutex from around context retirement, we need to make timeline pinning safe. Since multiple engines/contexts can share a single timeline, it needs to be protected by a mutex. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_timeline.c | 27 +++++++++---------- .../gpu/drm/i915/gt/intel_timeline_types.h | 2 +- drivers/gpu/drm/i915/gt/mock_engine.c | 6 ++--- 3 files changed, 16 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 355dfc52c804..7b476cd55dac 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -211,9 +211,9 @@ int intel_timeline_init(struct intel_timeline *timeline, void *vaddr; kref_init(&timeline->kref); + atomic_set(&timeline->pin_count, 0); timeline->gt = gt; - timeline->pin_count = 0; timeline->has_initial_breadcrumb = !hwsp; timeline->hwsp_cacheline = NULL; @@ -280,7 +280,7 @@ void intel_timelines_init(struct drm_i915_private *i915) void intel_timeline_fini(struct intel_timeline *timeline) { - GEM_BUG_ON(timeline->pin_count); + GEM_BUG_ON(atomic_read(&timeline->pin_count)); GEM_BUG_ON(!list_empty(&timeline->requests)); if (timeline->hwsp_cacheline) @@ -314,33 +314,31 @@ int intel_timeline_pin(struct intel_timeline *tl) { int err; - if (tl->pin_count++) + if (atomic_add_unless(&tl->pin_count, 1, 0)) return 0; - GEM_BUG_ON(!tl->pin_count); - GEM_BUG_ON(tl->active_count); err = i915_vma_pin(tl->hwsp_ggtt, 0, 0, PIN_GLOBAL | PIN_HIGH); if (err) - goto unpin; + return err; tl->hwsp_offset = i915_ggtt_offset(tl->hwsp_ggtt) + offset_in_page(tl->hwsp_offset); cacheline_acquire(tl->hwsp_cacheline); + if (atomic_fetch_inc(&tl->pin_count)) { + cacheline_release(tl->hwsp_cacheline); + __i915_vma_unpin(tl->hwsp_ggtt); + } return 0; - -unpin: - tl->pin_count = 0; - return err; } void intel_timeline_enter(struct intel_timeline *tl) { struct intel_gt_timelines *timelines = &tl->gt->timelines; - GEM_BUG_ON(!tl->pin_count); + GEM_BUG_ON(!atomic_read(&tl->pin_count)); if (tl->active_count++) return; GEM_BUG_ON(!tl->active_count); /* overflow? */ @@ -372,7 +370,7 @@ void intel_timeline_exit(struct intel_timeline *tl) static u32 timeline_advance(struct intel_timeline *tl) { - GEM_BUG_ON(!tl->pin_count); + GEM_BUG_ON(!atomic_read(&tl->pin_count)); GEM_BUG_ON(tl->seqno & tl->has_initial_breadcrumb); return tl->seqno += 1 + tl->has_initial_breadcrumb; @@ -523,11 +521,10 @@ int intel_timeline_read_hwsp(struct i915_request *from, void intel_timeline_unpin(struct intel_timeline *tl) { - GEM_BUG_ON(!tl->pin_count); - if (--tl->pin_count) + GEM_BUG_ON(!atomic_read(&tl->pin_count)); + if (!atomic_dec_and_test(&tl->pin_count)) return; - GEM_BUG_ON(tl->active_count); cacheline_release(tl->hwsp_cacheline); __i915_vma_unpin(tl->hwsp_ggtt); diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h index b1a9f0c54bf0..2b1baf2fcc8e 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h +++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h @@ -41,7 +41,7 @@ struct intel_timeline { * but the pin_count is protected by a combination of serialisation * from the intel_context caller plus internal atomicity. */ - unsigned int pin_count; + atomic_t pin_count; unsigned int active_count; const u32 *hwsp_seqno; diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index ba076886e79e..284f2e5ad2cf 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -38,13 +38,13 @@ struct mock_ring { static void mock_timeline_pin(struct intel_timeline *tl) { - tl->pin_count++; + atomic_inc(&tl->pin_count); } static void mock_timeline_unpin(struct intel_timeline *tl) { - GEM_BUG_ON(!tl->pin_count); - tl->pin_count--; + GEM_BUG_ON(!atomic_read(&tl->pin_count)); + atomic_dec(&tl->pin_count); } static struct intel_ring *mock_ring(struct intel_engine_cs *engine) From patchwork Fri Jul 26 08:46:07 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060557 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5D5B113A0 for ; Fri, 26 Jul 2019 08:46:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4C2BF28894 for ; Fri, 26 Jul 2019 08:46:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4032E28A3D; Fri, 26 Jul 2019 08:46:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id D426F28894 for ; Fri, 26 Jul 2019 08:46:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 51C176E8BC; Fri, 26 Jul 2019 08:46:49 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id AB49F6E8A7 for ; Fri, 26 Jul 2019 08:46:27 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621096-1500050 for multiple; Fri, 26 Jul 2019 09:46:17 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:07 +0100 Message-Id: <20190726084613.22129-21-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 21/27] drm/i915: Protect request retirement with timeline->mutex X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Forgo the struct_mutex requirement for request retirement as we have been transitioning over to only using the timeline->mutex for controlling the lifetime of a request on that timeline. Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 183 ++++++++++-------- drivers/gpu/drm/i915/gt/intel_context.h | 18 +- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 1 - .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 6 +- drivers/gpu/drm/i915/gt/intel_engine_types.h | 2 - drivers/gpu/drm/i915/gt/intel_gt.c | 1 - drivers/gpu/drm/i915/gt/intel_gt_types.h | 2 - drivers/gpu/drm/i915/gt/intel_lrc.c | 1 + drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 13 +- drivers/gpu/drm/i915/gt/mock_engine.c | 1 - drivers/gpu/drm/i915/gt/selftest_context.c | 9 +- drivers/gpu/drm/i915/i915_request.c | 151 +++++++-------- drivers/gpu/drm/i915/i915_request.h | 3 - 13 files changed, 208 insertions(+), 183 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 8d90498eaf46..44add172cdc8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -734,63 +734,6 @@ static int eb_select_context(struct i915_execbuffer *eb) return 0; } -static struct i915_request *__eb_wait_for_ring(struct intel_ring *ring) -{ - struct i915_request *rq; - - /* - * Completely unscientific finger-in-the-air estimates for suitable - * maximum user request size (to avoid blocking) and then backoff. - */ - if (intel_ring_update_space(ring) >= PAGE_SIZE) - return NULL; - - /* - * Find a request that after waiting upon, there will be at least half - * the ring available. The hysteresis allows us to compete for the - * shared ring and should mean that we sleep less often prior to - * claiming our resources, but not so long that the ring completely - * drains before we can submit our next request. - */ - list_for_each_entry(rq, &ring->request_list, ring_link) { - if (__intel_ring_space(rq->postfix, - ring->emit, ring->size) > ring->size / 2) - break; - } - if (&rq->ring_link == &ring->request_list) - return NULL; /* weird, we will check again later for real */ - - return i915_request_get(rq); -} - -static int eb_wait_for_ring(const struct i915_execbuffer *eb) -{ - struct i915_request *rq; - int ret = 0; - - /* - * Apply a light amount of backpressure to prevent excessive hogs - * from blocking waiting for space whilst holding struct_mutex and - * keeping all of their resources pinned. - */ - - rq = __eb_wait_for_ring(eb->context->ring); - if (rq) { - mutex_unlock(&eb->i915->drm.struct_mutex); - - if (i915_request_wait(rq, - I915_WAIT_INTERRUPTIBLE, - MAX_SCHEDULE_TIMEOUT) < 0) - ret = -EINTR; - - i915_request_put(rq); - - mutex_lock(&eb->i915->drm.struct_mutex); - } - - return ret; -} - static int eb_lookup_vmas(struct i915_execbuffer *eb) { struct radix_tree_root *handles_vma = &eb->gem_context->handles_vma; @@ -2118,8 +2061,73 @@ static const enum intel_engine_id user_ring_map[] = { [I915_EXEC_VEBOX] = VECS0 }; -static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce) +static struct i915_request *eb_throttle(struct intel_context *ce) { + struct intel_ring *ring = ce->ring; + struct intel_timeline *tl = ring->timeline; + struct i915_request *rq; + + /* + * Completely unscientific finger-in-the-air estimates for suitable + * maximum user request size (to avoid blocking) and then backoff. + */ + if (intel_ring_update_space(ring) >= PAGE_SIZE) + return NULL; + + /* + * Find a request that after waiting upon, there will be at least half + * the ring available. The hysteresis allows us to compete for the + * shared ring and should mean that we sleep less often prior to + * claiming our resources, but not so long that the ring completely + * drains before we can submit our next request. + */ + list_for_each_entry(rq, &tl->requests, link) { + if (rq->ring != ring) + continue; + + if (__intel_ring_space(rq->postfix, + ring->emit, ring->size) > ring->size / 2) + break; + } + if (&rq->link == &tl->requests) + return NULL; /* weird, we will check again later for real */ + + return i915_request_get(rq); +} + +static int +__eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce) +{ + int err; + + if (likely(atomic_inc_not_zero(&ce->pin_count))) + return 0; + + err = mutex_lock_interruptible(&eb->i915->drm.struct_mutex); + if (err) + return err; + + err = __intel_context_do_pin(ce); + mutex_unlock(&eb->i915->drm.struct_mutex); + + return err; +} + +static void +__eb_unpin_context(struct i915_execbuffer *eb, struct intel_context *ce) +{ + if (likely(atomic_add_unless(&ce->pin_count, -1, 1))) + return; + + mutex_lock(&eb->i915->drm.struct_mutex); + intel_context_unpin(ce); + mutex_unlock(&eb->i915->drm.struct_mutex); +} + +static int __eb_pin_engine(struct i915_execbuffer *eb, struct intel_context *ce) +{ + struct intel_timeline *tl; + struct i915_request *rq; int err; /* @@ -2135,7 +2143,7 @@ static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce) * GGTT space, so do this first before we reserve a seqno for * ourselves. */ - err = intel_context_pin(ce); + err = __eb_pin_context(eb, ce); if (err) return err; @@ -2147,23 +2155,43 @@ static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce) * until the timeline is idle, which in turn releases the wakeref * taken on the engine, and the parent device. */ - err = intel_context_timeline_lock(ce); - if (err) + tl = intel_context_timeline_lock(ce); + if (IS_ERR(tl)) { + err = PTR_ERR(tl); goto err_unpin; + } intel_context_enter(ce); - intel_context_timeline_unlock(ce); + rq = eb_throttle(ce); + + intel_context_timeline_unlock(tl); + + if (rq) { + if (i915_request_wait(rq, + I915_WAIT_INTERRUPTIBLE, + MAX_SCHEDULE_TIMEOUT) < 0) { + i915_request_put(rq); + err = -EINTR; + goto err_exit; + } + + i915_request_put(rq); + } eb->engine = ce->engine; eb->context = ce; return 0; +err_exit: + mutex_lock(&tl->mutex); + intel_context_exit(ce); + intel_context_timeline_unlock(tl); err_unpin: - intel_context_unpin(ce); + __eb_unpin_context(eb, ce); return err; } -static void eb_unpin_context(struct i915_execbuffer *eb) +static void eb_unpin_engine(struct i915_execbuffer *eb) { struct intel_context *ce = eb->context; struct intel_timeline *tl = ce->ring->timeline; @@ -2172,7 +2200,7 @@ static void eb_unpin_context(struct i915_execbuffer *eb) intel_context_exit(ce); mutex_unlock(&tl->mutex); - intel_context_unpin(ce); + __eb_unpin_context(eb, ce); } static unsigned int @@ -2217,9 +2245,9 @@ eb_select_legacy_ring(struct i915_execbuffer *eb, } static int -eb_select_engine(struct i915_execbuffer *eb, - struct drm_file *file, - struct drm_i915_gem_execbuffer2 *args) +eb_pin_engine(struct i915_execbuffer *eb, + struct drm_file *file, + struct drm_i915_gem_execbuffer2 *args) { struct intel_context *ce; unsigned int idx; @@ -2234,7 +2262,7 @@ eb_select_engine(struct i915_execbuffer *eb, if (IS_ERR(ce)) return PTR_ERR(ce); - err = eb_pin_context(eb, ce); + err = __eb_pin_engine(eb, ce); intel_context_put(ce); return err; @@ -2452,16 +2480,12 @@ i915_gem_do_execbuffer(struct drm_device *dev, if (unlikely(err)) goto err_destroy; - err = i915_mutex_lock_interruptible(dev); - if (err) - goto err_context; - - err = eb_select_engine(&eb, file, args); + err = eb_pin_engine(&eb, file, args); if (unlikely(err)) - goto err_unlock; + goto err_context; - err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */ - if (unlikely(err)) + err = i915_mutex_lock_interruptible(dev); + if (err) goto err_engine; err = eb_relocate(&eb); @@ -2615,10 +2639,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, err_vma: if (eb.exec) eb_release_vmas(&eb); -err_engine: - eb_unpin_context(&eb); -err_unlock: mutex_unlock(&dev->struct_mutex); +err_engine: + eb_unpin_engine(&eb); err_context: i915_gem_context_put(eb.gem_context); err_destroy: diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h index 13f28dd316bc..cb63fc7b1b18 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.h +++ b/drivers/gpu/drm/i915/gt/intel_context.h @@ -12,6 +12,7 @@ #include "i915_active.h" #include "intel_context_types.h" #include "intel_engine_types.h" +#include "intel_timeline_types.h" void intel_context_init(struct intel_context *ce, struct i915_gem_context *ctx, @@ -118,17 +119,24 @@ static inline void intel_context_put(struct intel_context *ce) kref_put(&ce->ref, ce->ops->destroy); } -static inline int __must_check +static inline struct intel_timeline *__must_check intel_context_timeline_lock(struct intel_context *ce) __acquires(&ce->ring->timeline->mutex) { - return mutex_lock_interruptible(&ce->ring->timeline->mutex); + struct intel_timeline *tl = ce->ring->timeline; + int err; + + err = mutex_lock_interruptible(&tl->mutex); + if (err) + return ERR_PTR(err); + + return tl; } -static inline void intel_context_timeline_unlock(struct intel_context *ce) - __releases(&ce->ring->timeline->mutex) +static inline void intel_context_timeline_unlock(struct intel_timeline *tl) + __releases(&tl->mutex) { - mutex_unlock(&ce->ring->timeline->mutex); + mutex_unlock(&tl->mutex); } int intel_context_prepare_remote_request(struct intel_context *ce, diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 095ddcbd75bf..7a7e385bb979 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -744,7 +744,6 @@ static int measure_breadcrumb_dw(struct intel_engine_cs *engine) engine->status_page.vma)) goto out_frame; - INIT_LIST_HEAD(&frame->ring.request_list); frame->ring.timeline = &frame->timeline; frame->ring.vaddr = frame->cs; frame->ring.size = sizeof(frame->cs); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c index 42330b1074e6..281a72af8406 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c @@ -24,6 +24,7 @@ static void heartbeat(struct work_struct *wrk) struct intel_engine_cs *engine = container_of(wrk, typeof(*engine), heartbeat.work); struct intel_context *ce = engine->kernel_context; + struct intel_timeline *tl; struct i915_request *rq; if (!intel_engine_pm_get_if_awake(engine)) @@ -57,7 +58,8 @@ static void heartbeat(struct work_struct *wrk) if (engine->wakeref_serial == engine->serial) goto out; - if (intel_context_timeline_lock(ce)) + tl = intel_context_timeline_lock(ce); + if (IS_ERR(tl)) goto out; intel_context_enter(ce); @@ -73,7 +75,7 @@ static void heartbeat(struct work_struct *wrk) __i915_request_commit(rq); unlock: - intel_context_timeline_unlock(ce); + intel_context_timeline_unlock(tl); out: schedule_delayed_work(&engine->heartbeat, delay()); intel_engine_pm_put(engine); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 0ef07ada0ae7..d8b82b6f7db9 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -63,8 +63,6 @@ struct intel_ring { void *vaddr; struct intel_timeline *timeline; - struct list_head request_list; - struct list_head active_link; /* * As we have two types of rings, one global to the engine used diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 117ca64149e1..c2a25c7adb46 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -14,7 +14,6 @@ void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915) gt->i915 = i915; gt->uncore = &i915->uncore; - INIT_LIST_HEAD(>->active_rings); INIT_LIST_HEAD(>->closed_vma); spin_lock_init(>->closed_lock); diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index b097d83fe112..70b8f025e4da 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -40,8 +40,6 @@ struct intel_gt { struct list_head hwsp_free_list; } timelines; - struct list_head active_rings; - struct intel_wakeref wakeref; struct list_head closed_vma; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 88e0ad93b9a9..2be97d68a103 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1634,6 +1634,7 @@ static void execlists_context_unpin(struct intel_context *ce) { i915_gem_context_unpin_hw_id(ce->gem_context); i915_gem_object_unpin_map(ce->state->obj); + intel_ring_reset(ce->ring, ce->ring->tail); } static void diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c index d8efb88f33f3..9d9be5fed9fc 100644 --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c @@ -1275,7 +1275,7 @@ void intel_ring_unpin(struct intel_ring *ring) GEM_TRACE("ring:%llx unpin\n", ring->timeline->fence_context); /* Discard any unused bytes beyond that submitted to hw. */ - intel_ring_reset(ring, ring->tail); + intel_ring_reset(ring, ring->emit); i915_vma_unset_ggtt_write(vma); if (i915_vma_is_map_and_fenceable(vma)) @@ -1340,7 +1340,6 @@ intel_engine_create_ring(struct intel_engine_cs *engine, return ERR_PTR(-ENOMEM); kref_init(&ring->ref); - INIT_LIST_HEAD(&ring->request_list); ring->timeline = intel_timeline_get(timeline); ring->size = size; @@ -1888,21 +1887,25 @@ static int ring_request_alloc(struct i915_request *request) static noinline int wait_for_space(struct intel_ring *ring, unsigned int bytes) { + struct intel_timeline *tl = ring->timeline; struct i915_request *target; long timeout; if (intel_ring_update_space(ring) >= bytes) return 0; - GEM_BUG_ON(list_empty(&ring->request_list)); - list_for_each_entry(target, &ring->request_list, ring_link) { + GEM_BUG_ON(list_empty(&tl->requests)); + list_for_each_entry(target, &tl->requests, link) { + if (target->ring != ring) + continue; + /* Would completion of this request free enough space? */ if (bytes <= __intel_ring_space(target->postfix, ring->emit, ring->size)) break; } - if (WARN_ON(&target->ring_link == &ring->request_list)) + if (GEM_WARN_ON(&target->link == &tl->requests)) return -ENOSPC; timeout = i915_request_wait(target, diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 284f2e5ad2cf..3ed511e8e098 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -68,7 +68,6 @@ static struct intel_ring *mock_ring(struct intel_engine_cs *engine) ring->base.timeline = &ring->timeline; atomic_set(&ring->base.pin_count, 1); - INIT_LIST_HEAD(&ring->base.request_list); intel_ring_update_space(&ring->base); return &ring->base; diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c index e3b45fe747ae..6dd1dfda3c5a 100644 --- a/drivers/gpu/drm/i915/gt/selftest_context.c +++ b/drivers/gpu/drm/i915/gt/selftest_context.c @@ -20,10 +20,13 @@ static int request_sync(struct i915_request *rq) i915_request_add(rq); timeout = i915_request_wait(rq, 0, HZ / 10); - if (timeout < 0) + if (timeout < 0) { err = timeout; - else + } else { + mutex_lock(&rq->timeline->mutex); i915_request_retire_upto(rq); + mutex_unlock(&rq->timeline->mutex); + } i915_request_put(rq); @@ -35,6 +38,7 @@ static int context_sync(struct intel_context *ce) struct intel_timeline *tl = ce->ring->timeline; int err = 0; + mutex_lock(&tl->mutex); do { struct i915_request *rq; long timeout; @@ -55,6 +59,7 @@ static int context_sync(struct intel_context *ce) i915_request_put(rq); } while (!err); + mutex_unlock(&tl->mutex); return err; } diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 81094f250bdb..d203763b7d86 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -180,40 +180,6 @@ i915_request_remove_from_client(struct i915_request *request) spin_unlock(&file_priv->mm.lock); } -static void advance_ring(struct i915_request *request) -{ - struct intel_ring *ring = request->ring; - unsigned int tail; - - /* - * We know the GPU must have read the request to have - * sent us the seqno + interrupt, so use the position - * of tail of the request to update the last known position - * of the GPU head. - * - * Note this requires that we are always called in request - * completion order. - */ - GEM_BUG_ON(!list_is_first(&request->ring_link, &ring->request_list)); - if (list_is_last(&request->ring_link, &ring->request_list)) { - /* - * We may race here with execlists resubmitting this request - * as we retire it. The resubmission will move the ring->tail - * forwards (to request->wa_tail). We either read the - * current value that was written to hw, or the value that - * is just about to be. Either works, if we miss the last two - * noops - they are safe to be replayed on a reset. - */ - tail = READ_ONCE(request->tail); - list_del(&ring->active_link); - } else { - tail = request->postfix; - } - list_del_init(&request->ring_link); - - ring->head = tail; -} - static void free_capture_list(struct i915_request *request) { struct i915_capture_list *capture; @@ -231,7 +197,7 @@ static bool i915_request_retire(struct i915_request *rq) { struct i915_active_request *active, *next; - lockdep_assert_held(&rq->i915->drm.struct_mutex); + lockdep_assert_held(&rq->timeline->mutex); if (!i915_request_completed(rq)) return false; @@ -243,7 +209,17 @@ static bool i915_request_retire(struct i915_request *rq) GEM_BUG_ON(!i915_sw_fence_signaled(&rq->submit)); trace_i915_request_retire(rq); - advance_ring(rq); + /* + * We know the GPU must have read the request to have + * sent us the seqno + interrupt, so use the position + * of tail of the request to update the last known position + * of the GPU head. + * + * Note this requires that we are always called in request + * completion order. + */ + GEM_BUG_ON(!list_is_first(&rq->link, &rq->timeline->requests)); + rq->ring->head = rq->postfix; /* * Walk through the active list, calling retire on each. This allows @@ -320,7 +296,7 @@ static bool i915_request_retire(struct i915_request *rq) void i915_request_retire_upto(struct i915_request *rq) { - struct intel_ring *ring = rq->ring; + struct intel_timeline * const tl = rq->timeline; struct i915_request *tmp; GEM_TRACE("%s fence %llx:%lld, current %d\n", @@ -328,15 +304,11 @@ void i915_request_retire_upto(struct i915_request *rq) rq->fence.context, rq->fence.seqno, hwsp_seqno(rq)); - lockdep_assert_held(&rq->i915->drm.struct_mutex); + lockdep_assert_held(&tl->mutex); GEM_BUG_ON(!i915_request_completed(rq)); - if (list_empty(&rq->ring_link)) - return; - do { - tmp = list_first_entry(&ring->request_list, - typeof(*tmp), ring_link); + tmp = list_first_entry(&tl->requests, typeof(*tmp), link); } while (i915_request_retire(tmp) && tmp != rq); } @@ -563,29 +535,28 @@ semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) return NOTIFY_DONE; } -static void ring_retire_requests(struct intel_ring *ring) +static void retire_requests(struct intel_timeline *tl) { struct i915_request *rq, *rn; - list_for_each_entry_safe(rq, rn, &ring->request_list, ring_link) + list_for_each_entry_safe(rq, rn, &tl->requests, link) if (!i915_request_retire(rq)) break; } static noinline struct i915_request * -request_alloc_slow(struct intel_context *ce, gfp_t gfp) +request_alloc_slow(struct intel_timeline *tl, gfp_t gfp) { - struct intel_ring *ring = ce->ring; struct i915_request *rq; - if (list_empty(&ring->request_list)) + if (list_empty(&tl->requests)) goto out; if (!gfpflags_allow_blocking(gfp)) goto out; /* Move our oldest request to the slab-cache (if not in use!) */ - rq = list_first_entry(&ring->request_list, typeof(*rq), ring_link); + rq = list_first_entry(&tl->requests, typeof(*rq), link); i915_request_retire(rq); rq = kmem_cache_alloc(global.slab_requests, @@ -594,11 +565,11 @@ request_alloc_slow(struct intel_context *ce, gfp_t gfp) return rq; /* Ratelimit ourselves to prevent oom from malicious clients */ - rq = list_last_entry(&ring->request_list, typeof(*rq), ring_link); + rq = list_last_entry(&tl->requests, typeof(*rq), link); cond_synchronize_rcu(rq->rcustate); /* Retire our old requests in the hope that we free some */ - ring_retire_requests(ring); + retire_requests(tl); out: return kmem_cache_alloc(global.slab_requests, gfp); @@ -649,7 +620,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp) rq = kmem_cache_alloc(global.slab_requests, gfp | __GFP_RETRY_MAYFAIL | __GFP_NOWARN); if (unlikely(!rq)) { - rq = request_alloc_slow(ce, gfp); + rq = request_alloc_slow(tl, gfp); if (!rq) { ret = -ENOMEM; goto err_unreserve; @@ -741,15 +712,15 @@ struct i915_request * i915_request_create(struct intel_context *ce) { struct i915_request *rq; - int err; + struct intel_timeline *tl; - err = intel_context_timeline_lock(ce); - if (err) - return ERR_PTR(err); + tl = intel_context_timeline_lock(ce); + if (IS_ERR(tl)) + return ERR_CAST(tl); /* Move our oldest request to the slab-cache (if not in use!) */ - rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link); - if (!list_is_last(&rq->ring_link, &ce->ring->request_list)) + rq = list_first_entry(&tl->requests, typeof(*rq), link); + if (!list_is_last(&rq->link, &tl->requests)) i915_request_retire(rq); intel_context_enter(ce); @@ -759,22 +730,22 @@ i915_request_create(struct intel_context *ce) goto err_unlock; /* Check that we do not interrupt ourselves with a new request */ - rq->cookie = lockdep_pin_lock(&ce->ring->timeline->mutex); + rq->cookie = lockdep_pin_lock(&tl->mutex); return rq; err_unlock: - intel_context_timeline_unlock(ce); + intel_context_timeline_unlock(tl); return rq; } static int i915_request_await_start(struct i915_request *rq, struct i915_request *signal) { - if (list_is_first(&signal->ring_link, &signal->ring->request_list)) + if (list_is_first(&signal->link, &signal->ring->timeline->requests)) return 0; - signal = list_prev_entry(signal, ring_link); + signal = list_prev_entry(signal, link); if (intel_timeline_sync_is_later(rq->timeline, &signal->fence)) return 0; @@ -1167,6 +1138,7 @@ struct i915_request *__i915_request_commit(struct i915_request *rq) */ GEM_BUG_ON(rq->reserved_space > ring->space); rq->reserved_space = 0; + rq->emitted_jiffies = jiffies; /* * Record the position of the start of the breadcrumb so that @@ -1180,11 +1152,6 @@ struct i915_request *__i915_request_commit(struct i915_request *rq) prev = __i915_request_add_to_timeline(rq); - list_add_tail(&rq->ring_link, &ring->request_list); - if (list_is_first(&rq->ring_link, &ring->request_list)) - list_add(&ring->active_link, &rq->i915->gt.active_rings); - rq->emitted_jiffies = jiffies; - /* * Let the backend know a new request has arrived that may need * to adjust the existing execution schedule due to a high priority @@ -1235,10 +1202,11 @@ struct i915_request *__i915_request_commit(struct i915_request *rq) void i915_request_add(struct i915_request *rq) { + struct intel_timeline * const tl = rq->timeline; struct i915_request *prev; - lockdep_assert_held(&rq->timeline->mutex); - lockdep_unpin_lock(&rq->timeline->mutex, rq->cookie); + lockdep_assert_held(&tl->mutex); + lockdep_unpin_lock(&tl->mutex, rq->cookie); trace_i915_request_add(rq); @@ -1261,10 +1229,10 @@ void i915_request_add(struct i915_request *rq) * work on behalf of others -- but instead we should benefit from * improved resource management. (Well, that's the theory at least.) */ - if (prev && i915_request_completed(prev)) + if (prev && i915_request_completed(prev) && prev->timeline == tl) i915_request_retire_upto(prev); - mutex_unlock(&rq->timeline->mutex); + mutex_unlock(&tl->mutex); } static unsigned long local_clock_us(unsigned int *cpu) @@ -1484,18 +1452,43 @@ long i915_request_wait(struct i915_request *rq, bool i915_retire_requests(struct drm_i915_private *i915) { - struct intel_ring *ring, *tmp; + struct intel_gt_timelines *timelines = &i915->gt.timelines; + struct intel_timeline *tl, *tn; + LIST_HEAD(free); + + spin_lock(&timelines->lock); + list_for_each_entry_safe(tl, tn, &timelines->active_list, link) { + if (!mutex_trylock(&tl->mutex)) + continue; + + intel_timeline_get(tl); + GEM_BUG_ON(!tl->active_count); + tl->active_count++; /* pin the list element */ + spin_unlock(&timelines->lock); - lockdep_assert_held(&i915->drm.struct_mutex); + retire_requests(tl); - list_for_each_entry_safe(ring, tmp, - &i915->gt.active_rings, active_link) { - intel_ring_get(ring); /* last rq holds reference! */ - ring_retire_requests(ring); - intel_ring_put(ring); + spin_lock(&timelines->lock); + + /* Restart iteration after dropping lock */ + list_safe_reset_next(tl, tn, link); + if (!--tl->active_count) + list_del(&tl->link); + + mutex_unlock(&tl->mutex); + + /* Defer the final release to after the spinlock */ + if (refcount_dec_and_test(&tl->kref.refcount)) { + GEM_BUG_ON(tl->active_count); + list_add(&tl->link, &free); + } } + spin_unlock(&timelines->lock); + + list_for_each_entry_safe(tl, tn, &free, link) + __intel_timeline_free(&tl->kref); - return !list_empty(&i915->gt.active_rings); + return !list_empty(&timelines->active_list); } #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index 313df3c37158..22e506e960e0 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -223,9 +223,6 @@ struct i915_request { /** timeline->request entry for this request */ struct list_head link; - /** ring->request_list entry for this request */ - struct list_head ring_link; - struct drm_i915_file_private *file_priv; /** file_priv list entry for this request */ struct list_head client_link; From patchwork Fri Jul 26 08:46:08 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060599 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C363B13B1 for ; Fri, 26 Jul 2019 09:04:17 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B197F28A68 for ; Fri, 26 Jul 2019 09:04:17 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A468B28A6A; Fri, 26 Jul 2019 09:04:17 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1EB3E289F9 for ; Fri, 26 Jul 2019 09:04:16 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7FEF76ECC0; Fri, 26 Jul 2019 09:04:15 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8D48B6ECBD for ; Fri, 26 Jul 2019 09:04:03 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621097-1500050 for multiple; Fri, 26 Jul 2019 09:46:18 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:08 +0100 Message-Id: <20190726084613.22129-22-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 22/27] drm/i915: Replace struct_mutex for batch pool serialisation X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Matthew Auld Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Switch to tracking activity via i915_active on individual nodes, only keeping a list of retired objects in the cache, and reaping the cache when the engine itself idles. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/Makefile | 2 +- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 58 +++--- drivers/gpu/drm/i915/gem/i915_gem_object.c | 1 - .../gpu/drm/i915/gem/i915_gem_object_types.h | 1 - drivers/gpu/drm/i915/gem/i915_gem_pm.c | 4 +- drivers/gpu/drm/i915/gt/intel_engine.h | 1 - drivers/gpu/drm/i915/gt/intel_engine_cs.c | 11 +- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 + drivers/gpu/drm/i915/gt/intel_engine_pool.c | 177 ++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_engine_pool.h | 34 ++++ .../gpu/drm/i915/gt/intel_engine_pool_types.h | 29 +++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 6 +- drivers/gpu/drm/i915/gt/mock_engine.c | 3 + drivers/gpu/drm/i915/i915_debugfs.c | 66 ------- drivers/gpu/drm/i915/i915_gem_batch_pool.c | 132 ------------- drivers/gpu/drm/i915/i915_gem_batch_pool.h | 26 --- 16 files changed, 290 insertions(+), 263 deletions(-) create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_pool.c create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_pool.h create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_pool_types.h delete mode 100644 drivers/gpu/drm/i915/i915_gem_batch_pool.c delete mode 100644 drivers/gpu/drm/i915/i915_gem_batch_pool.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 4d68529a270e..690a79ae4dc8 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -73,6 +73,7 @@ obj-y += gt/ gt-y += \ gt/intel_breadcrumbs.o \ gt/intel_context.o \ + gt/intel_engine_pool.o \ gt/intel_engine_cs.o \ gt/intel_engine_heartbeat.o \ gt/intel_engine_pm.o \ @@ -127,7 +128,6 @@ i915-y += \ $(gem-y) \ i915_active.o \ i915_cmd_parser.o \ - i915_gem_batch_pool.o \ i915_gem_evict.o \ i915_gem_fence_reg.o \ i915_gem_gtt.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 44add172cdc8..19f0f21ee59e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -16,6 +16,7 @@ #include "gem/i915_gem_ioctls.h" #include "gt/intel_context.h" +#include "gt/intel_engine_pool.h" #include "gt/intel_gt.h" #include "gt/intel_gt_pm.h" @@ -1141,25 +1142,26 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, unsigned int len) { struct reloc_cache *cache = &eb->reloc_cache; - struct drm_i915_gem_object *obj; + struct intel_engine_pool_node *pool; struct i915_request *rq; struct i915_vma *batch; u32 *cmd; int err; - obj = i915_gem_batch_pool_get(&eb->engine->batch_pool, PAGE_SIZE); - if (IS_ERR(obj)) - return PTR_ERR(obj); + pool = intel_engine_pool_get(&eb->engine->pool, PAGE_SIZE); + if (IS_ERR(pool)) + return PTR_ERR(pool); - cmd = i915_gem_object_pin_map(obj, + cmd = i915_gem_object_pin_map(pool->obj, cache->has_llc ? I915_MAP_FORCE_WB : I915_MAP_FORCE_WC); - i915_gem_object_unpin_pages(obj); - if (IS_ERR(cmd)) - return PTR_ERR(cmd); + if (IS_ERR(cmd)) { + err = PTR_ERR(cmd); + goto out_pool; + } - batch = i915_vma_instance(obj, vma->vm, NULL); + batch = i915_vma_instance(pool->obj, vma->vm, NULL); if (IS_ERR(batch)) { err = PTR_ERR(batch); goto err_unmap; @@ -1175,6 +1177,10 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, goto err_unpin; } + err = intel_engine_pool_mark_active(pool, rq); + if (err) + goto err_request; + err = reloc_move_to_gpu(rq, vma); if (err) goto err_request; @@ -1200,7 +1206,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, cache->rq_size = 0; /* Return with batch mapping (cmd) still pinned */ - return 0; + goto out_pool; skip_request: i915_request_skip(rq, err); @@ -1209,7 +1215,9 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, err_unpin: i915_vma_unpin(batch); err_unmap: - i915_gem_object_unpin_map(obj); + i915_gem_object_unpin_map(pool->obj); +out_pool: + intel_engine_pool_put(pool); return err; } @@ -1953,18 +1961,17 @@ static int i915_reset_gen7_sol_offsets(struct i915_request *rq) static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master) { - struct drm_i915_gem_object *shadow_batch_obj; + struct intel_engine_pool_node *pool; struct i915_vma *vma; int err; - shadow_batch_obj = i915_gem_batch_pool_get(&eb->engine->batch_pool, - PAGE_ALIGN(eb->batch_len)); - if (IS_ERR(shadow_batch_obj)) - return ERR_CAST(shadow_batch_obj); + pool = intel_engine_pool_get(&eb->engine->pool, eb->batch_len); + if (IS_ERR(pool)) + return ERR_CAST(pool); err = intel_engine_cmd_parser(eb->engine, eb->batch->obj, - shadow_batch_obj, + pool->obj, eb->batch_start_offset, eb->batch_len, is_master); @@ -1973,12 +1980,12 @@ static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master) vma = NULL; else vma = ERR_PTR(err); - goto out; + goto err; } - vma = i915_gem_object_ggtt_pin(shadow_batch_obj, NULL, 0, 0, 0); + vma = i915_gem_object_ggtt_pin(pool->obj, NULL, 0, 0, 0); if (IS_ERR(vma)) - goto out; + goto err; eb->vma[eb->buffer_count] = i915_vma_get(vma); eb->flags[eb->buffer_count] = @@ -1986,8 +1993,11 @@ static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master) vma->exec_flags = &eb->flags[eb->buffer_count]; eb->buffer_count++; -out: - i915_gem_object_unpin_pages(shadow_batch_obj); + vma->private = pool; + return vma; + +err: + intel_engine_pool_put(pool); return vma; } @@ -2612,6 +2622,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, * to explicitly hold another reference here. */ eb.request->batch = eb.batch; + if (eb.batch->private) + intel_engine_pool_mark_active(eb.batch->private, eb.request); trace_i915_request_queue(eb.request, eb.batch_flags); err = eb_submit(&eb); @@ -2636,6 +2648,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, err_batch_unpin: if (eb.batch_flags & I915_DISPATCH_SECURE) i915_vma_unpin(eb.batch); + if (eb.batch->private) + intel_engine_pool_put(eb.batch->private); err_vma: if (eb.exec) eb_release_vmas(&eb); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 4ea97fca9c35..eccd7f4768f8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -66,7 +66,6 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj, INIT_LIST_HEAD(&obj->mm.link); INIT_LIST_HEAD(&obj->lut_list); - INIT_LIST_HEAD(&obj->batch_pool_link); init_rcu_head(&obj->rcu); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index 34b51fad02de..d474c6ac4100 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -114,7 +114,6 @@ struct drm_i915_gem_object { unsigned int userfault_count; struct list_head userfault_link; - struct list_head batch_pool_link; I915_SELFTEST_DECLARE(struct list_head st_link); /* diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c index 392a1e4fdd76..b82b4dd7a3d5 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c @@ -34,10 +34,8 @@ static void i915_gem_park(struct drm_i915_private *i915) lockdep_assert_held(&i915->drm.struct_mutex); - for_each_engine(engine, i915, id) { + for_each_engine(engine, i915, id) call_idle_barriers(engine); /* cleanup after wedging */ - i915_gem_batch_pool_fini(&engine->batch_pool); - } i915_vma_parked(i915); diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index f0ad33929987..a9173d4f0829 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -9,7 +9,6 @@ #include #include -#include "i915_gem_batch_pool.h" #include "i915_pmu.h" #include "i915_reg.h" #include "i915_request.h" diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 7a7e385bb979..529497063ccf 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -32,6 +32,7 @@ #include "intel_engine.h" #include "intel_engine_pm.h" +#include "intel_engine_pool.h" #include "intel_engine_user.h" #include "intel_context.h" #include "intel_lrc.h" @@ -494,11 +495,6 @@ int intel_engines_init(struct drm_i915_private *i915) return err; } -static void intel_engine_init_batch_pool(struct intel_engine_cs *engine) -{ - i915_gem_batch_pool_init(&engine->batch_pool, engine); -} - void intel_engine_init_execlists(struct intel_engine_cs *engine) { struct intel_engine_execlists * const execlists = &engine->execlists; @@ -623,10 +619,11 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine) intel_engine_init_active(engine, ENGINE_PHYSICAL); intel_engine_init_breadcrumbs(engine); intel_engine_init_execlists(engine); - intel_engine_init_batch_pool(engine); intel_engine_init_cmd_parser(engine); intel_engine_init__pm(engine); + intel_engine_pool_init(&engine->pool); + /* Use the whole device by default */ engine->sseu = intel_sseu_from_device_info(&RUNTIME_INFO(engine->i915)->sseu); @@ -868,9 +865,9 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine) cleanup_status_page(engine); + intel_engine_pool_fini(&engine->pool); intel_engine_fini_breadcrumbs(engine); intel_engine_cleanup_cmd_parser(engine); - i915_gem_batch_pool_fini(&engine->batch_pool); if (engine->default_state) i915_gem_object_put(engine->default_state); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 7bbc25631979..4488cf5de0e8 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -9,6 +9,7 @@ #include "intel_engine.h" #include "intel_engine_heartbeat.h" #include "intel_engine_pm.h" +#include "intel_engine_pool.h" #include "intel_gt.h" #include "intel_gt_pm.h" @@ -119,6 +120,7 @@ static int __engine_park(struct intel_wakeref *wf) intel_engine_park_heartbeat(engine); intel_engine_disarm_breadcrumbs(engine); + intel_engine_pool_park(&engine->pool); /* Must be reset upon idling, or we may miss the busy wakeup. */ GEM_BUG_ON(engine->execlists.queue_priority_hint != INT_MIN); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.c b/drivers/gpu/drm/i915/gt/intel_engine_pool.c new file mode 100644 index 000000000000..03d90b49584a --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.c @@ -0,0 +1,177 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2018 Intel Corporation + */ + +#include "gem/i915_gem_object.h" + +#include "i915_drv.h" +#include "intel_engine_pm.h" +#include "intel_engine_pool.h" + +static struct intel_engine_cs *to_engine(struct intel_engine_pool *pool) +{ + return container_of(pool, struct intel_engine_cs, pool); +} + +static struct list_head * +bucket_for_size(struct intel_engine_pool *pool, size_t sz) +{ + int n; + + /* + * Compute a power-of-two bucket, but throw everything greater than + * 16KiB into the same bucket: i.e. the buckets hold objects of + * (1 page, 2 pages, 4 pages, 8+ pages). + */ + n = fls(sz >> PAGE_SHIFT) - 1; + if (n >= ARRAY_SIZE(pool->cache_list)) + n = ARRAY_SIZE(pool->cache_list) - 1; + + return &pool->cache_list[n]; +} + +static void node_free(struct intel_engine_pool_node *node) +{ + i915_gem_object_put(node->obj); + i915_active_fini(&node->active); + kfree(node); +} + +static int pool_active(struct i915_active *ref) +{ + struct intel_engine_pool_node *node = + container_of(ref, typeof(*node), active); + struct reservation_object *resv = node->obj->base.resv; + int err; + + if (reservation_object_trylock(resv)) { + reservation_object_add_excl_fence(resv, NULL); + reservation_object_unlock(resv); + } + + err = i915_gem_object_pin_pages(node->obj); + if (err) + return err; + + /* Hide this pinned object from the shrinker until retired */ + i915_gem_object_make_unshrinkable(node->obj); + + return 0; +} + +static void pool_retire(struct i915_active *ref) +{ + struct intel_engine_pool_node *node = + container_of(ref, typeof(*node), active); + struct intel_engine_pool *pool = node->pool; + struct list_head *list = bucket_for_size(pool, node->obj->base.size); + unsigned long flags; + + GEM_BUG_ON(!intel_engine_pm_is_awake(to_engine(pool))); + + i915_gem_object_unpin_pages(node->obj); + + /* Return this object to the shrinker pool */ + i915_gem_object_make_purgeable(node->obj); + + spin_lock_irqsave(&pool->lock, flags); + list_add(&node->link, list); + spin_unlock_irqrestore(&pool->lock, flags); +} + +static struct intel_engine_pool_node * +node_create(struct intel_engine_pool *pool, size_t sz) +{ + struct intel_engine_cs *engine = to_engine(pool); + struct intel_engine_pool_node *node; + struct drm_i915_gem_object *obj; + + node = kmalloc(sizeof(*node), + GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN); + if (!node) + return ERR_PTR(-ENOMEM); + + node->pool = pool; + i915_active_init(engine->i915, &node->active, pool_active, pool_retire); + + obj = i915_gem_object_create_internal(engine->i915, sz); + if (IS_ERR(obj)) { + i915_active_fini(&node->active); + kfree(node); + return ERR_CAST(obj); + } + + node->obj = obj; + return node; +} + +struct intel_engine_pool_node * +intel_engine_pool_get(struct intel_engine_pool *pool, size_t size) +{ + struct intel_engine_pool_node *node; + struct list_head *list; + unsigned long flags; + int ret; + + GEM_BUG_ON(!intel_engine_pm_is_awake(to_engine(pool))); + + size = PAGE_ALIGN(size); + list = bucket_for_size(pool, size); + + spin_lock_irqsave(&pool->lock, flags); + list_for_each_entry(node, list, link) { + if (node->obj->base.size < size) + continue; + list_del(&node->link); + break; + } + spin_unlock_irqrestore(&pool->lock, flags); + + if (&node->link == list) { + node = node_create(pool, size); + if (IS_ERR(node)) + return node; + } + + ret = i915_active_acquire(&node->active); + if (ret) { + node_free(node); + return ERR_PTR(ret); + } + + return node; +} + +void intel_engine_pool_init(struct intel_engine_pool *pool) +{ + int n; + + spin_lock_init(&pool->lock); + for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) + INIT_LIST_HEAD(&pool->cache_list[n]); +} + +void intel_engine_pool_park(struct intel_engine_pool *pool) +{ + int n; + + for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) { + struct list_head *list = &pool->cache_list[n]; + struct intel_engine_pool_node *node, *nn; + + list_for_each_entry_safe(node, nn, list, link) + node_free(node); + + INIT_LIST_HEAD(list); + } +} + +void intel_engine_pool_fini(struct intel_engine_pool *pool) +{ + int n; + + for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) + GEM_BUG_ON(!list_empty(&pool->cache_list[n])); +} diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.h b/drivers/gpu/drm/i915/gt/intel_engine_pool.h new file mode 100644 index 000000000000..f7a0a660c1c9 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.h @@ -0,0 +1,34 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2018 Intel Corporation + */ + +#ifndef INTEL_ENGINE_POOL_H +#define INTEL_ENGINE_POOL_H + +#include "intel_engine_pool_types.h" +#include "i915_active.h" +#include "i915_request.h" + +struct intel_engine_pool_node * +intel_engine_pool_get(struct intel_engine_pool *pool, size_t size); + +static inline int +intel_engine_pool_mark_active(struct intel_engine_pool_node *node, + struct i915_request *rq) +{ + return i915_active_ref(&node->active, rq->fence.context, rq); +} + +static inline void +intel_engine_pool_put(struct intel_engine_pool_node *node) +{ + i915_active_release(&node->active); +} + +void intel_engine_pool_init(struct intel_engine_pool *pool); +void intel_engine_pool_park(struct intel_engine_pool *pool); +void intel_engine_pool_fini(struct intel_engine_pool *pool); + +#endif /* INTEL_ENGINE_POOL_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool_types.h b/drivers/gpu/drm/i915/gt/intel_engine_pool_types.h new file mode 100644 index 000000000000..e31ee361b76f --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_engine_pool_types.h @@ -0,0 +1,29 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2018 Intel Corporation + */ + +#ifndef INTEL_ENGINE_POOL_TYPES_H +#define INTEL_ENGINE_POOL_TYPES_H + +#include +#include + +#include "i915_active_types.h" + +struct drm_i915_gem_object; + +struct intel_engine_pool { + spinlock_t lock; + struct list_head cache_list[4]; +}; + +struct intel_engine_pool_node { + struct i915_active active; + struct drm_i915_gem_object *obj; + struct list_head link; + struct intel_engine_pool *pool; +}; + +#endif /* INTEL_ENGINE_POOL_TYPES_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index d8b82b6f7db9..8230593c9f53 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -18,12 +18,12 @@ #include #include "i915_gem.h" -#include "i915_gem_batch_pool.h" #include "i915_pmu.h" #include "i915_priolist_types.h" #include "i915_selftest.h" -#include "gt/intel_timeline_types.h" +#include "intel_engine_pool_types.h" #include "intel_sseu.h" +#include "intel_timeline_types.h" #include "intel_wakeref.h" #include "intel_workarounds_types.h" @@ -353,7 +353,7 @@ struct intel_engine_cs { * when the command parser is enabled. Prevents the client from * modifying the batch contents after software parsing. */ - struct i915_gem_batch_pool batch_pool; + struct intel_engine_pool pool; struct intel_hw_status_page status_page; struct i915_ctx_workarounds wa_ctx; diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 3ed511e8e098..0d90d72d0d5c 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -27,6 +27,7 @@ #include "i915_drv.h" #include "intel_context.h" #include "intel_engine_pm.h" +#include "intel_engine_pool.h" #include "mock_engine.h" #include "selftests/mock_request.h" @@ -288,6 +289,8 @@ int mock_engine_init(struct intel_engine_cs *engine) intel_engine_init_execlists(engine); intel_engine_init__pm(engine); + intel_engine_pool_init(&engine->pool); + engine->kernel_context = i915_gem_context_get_engine(i915->kernel_context, engine->id); if (IS_ERR(engine->kernel_context)) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index a51a0c8e8a83..3d04e04441db 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -295,26 +295,6 @@ static int per_file_stats(int id, void *ptr, void *data) stats.closed); \ } while (0) -static void print_batch_pool_stats(struct seq_file *m, - struct drm_i915_private *dev_priv) -{ - struct drm_i915_gem_object *obj; - struct intel_engine_cs *engine; - struct file_stats stats = {}; - int j; - - for_each_user_engine(engine, dev_priv) { - for (j = 0; j < ARRAY_SIZE(engine->batch_pool.cache_list); j++) { - list_for_each_entry(obj, - &engine->batch_pool.cache_list[j], - batch_pool_link) - per_file_stats(0, obj, &stats); - } - } - - print_file_stats(m, "[k]batch pool", stats); -} - static void print_context_stats(struct seq_file *m, struct drm_i915_private *i915) { @@ -377,57 +357,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data) if (ret) return ret; - print_batch_pool_stats(m, i915); print_context_stats(m, i915); mutex_unlock(&i915->drm.struct_mutex); return 0; } -static int i915_gem_batch_pool_info(struct seq_file *m, void *data) -{ - struct drm_i915_private *dev_priv = node_to_i915(m->private); - struct drm_device *dev = &dev_priv->drm; - struct drm_i915_gem_object *obj; - struct intel_engine_cs *engine; - int total = 0; - int ret, j; - - ret = mutex_lock_interruptible(&dev->struct_mutex); - if (ret) - return ret; - - for_each_user_engine(engine, dev_priv) { - for (j = 0; j < ARRAY_SIZE(engine->batch_pool.cache_list); j++) { - int count; - - count = 0; - list_for_each_entry(obj, - &engine->batch_pool.cache_list[j], - batch_pool_link) - count++; - seq_printf(m, "%s cache[%d]: %d objects\n", - engine->name, j, count); - - list_for_each_entry(obj, - &engine->batch_pool.cache_list[j], - batch_pool_link) { - seq_puts(m, " "); - describe_obj(m, obj); - seq_putc(m, '\n'); - } - - total += count; - } - } - - seq_printf(m, "total: %d\n", total); - - mutex_unlock(&dev->struct_mutex); - - return 0; -} - static void gen8_display_interrupt_info(struct seq_file *m) { struct drm_i915_private *dev_priv = node_to_i915(m->private); @@ -4295,7 +4230,6 @@ static const struct drm_info_list i915_debugfs_list[] = { {"i915_gem_objects", i915_gem_object_info, 0}, {"i915_gem_fence_regs", i915_gem_fence_regs_info, 0}, {"i915_gem_interrupt", i915_interrupt_info, 0}, - {"i915_gem_batch_pool", i915_gem_batch_pool_info, 0}, {"i915_guc_info", i915_guc_info, 0}, {"i915_guc_load_status", i915_guc_load_status_info, 0}, {"i915_guc_log_dump", i915_guc_log_dump, 0}, diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c b/drivers/gpu/drm/i915/i915_gem_batch_pool.c deleted file mode 100644 index b17f23991253..000000000000 --- a/drivers/gpu/drm/i915/i915_gem_batch_pool.c +++ /dev/null @@ -1,132 +0,0 @@ -/* - * SPDX-License-Identifier: MIT - * - * Copyright © 2014-2018 Intel Corporation - */ - -#include "i915_gem_batch_pool.h" -#include "i915_drv.h" - -/** - * DOC: batch pool - * - * In order to submit batch buffers as 'secure', the software command parser - * must ensure that a batch buffer cannot be modified after parsing. It does - * this by copying the user provided batch buffer contents to a kernel owned - * buffer from which the hardware will actually execute, and by carefully - * managing the address space bindings for such buffers. - * - * The batch pool framework provides a mechanism for the driver to manage a - * set of scratch buffers to use for this purpose. The framework can be - * extended to support other uses cases should they arise. - */ - -/** - * i915_gem_batch_pool_init() - initialize a batch buffer pool - * @pool: the batch buffer pool - * @engine: the associated request submission engine - */ -void i915_gem_batch_pool_init(struct i915_gem_batch_pool *pool, - struct intel_engine_cs *engine) -{ - int n; - - pool->engine = engine; - - for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) - INIT_LIST_HEAD(&pool->cache_list[n]); -} - -/** - * i915_gem_batch_pool_fini() - clean up a batch buffer pool - * @pool: the pool to clean up - * - * Note: Callers must hold the struct_mutex. - */ -void i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool) -{ - int n; - - lockdep_assert_held(&pool->engine->i915->drm.struct_mutex); - - for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) { - struct drm_i915_gem_object *obj, *next; - - list_for_each_entry_safe(obj, next, - &pool->cache_list[n], - batch_pool_link) - i915_gem_object_put(obj); - - INIT_LIST_HEAD(&pool->cache_list[n]); - } -} - -/** - * i915_gem_batch_pool_get() - allocate a buffer from the pool - * @pool: the batch buffer pool - * @size: the minimum desired size of the returned buffer - * - * Returns an inactive buffer from @pool with at least @size bytes, - * with the pages pinned. The caller must i915_gem_object_unpin_pages() - * on the returned object. - * - * Note: Callers must hold the struct_mutex - * - * Return: the buffer object or an error pointer - */ -struct drm_i915_gem_object * -i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool, - size_t size) -{ - struct drm_i915_gem_object *obj; - struct list_head *list; - int n, ret; - - lockdep_assert_held(&pool->engine->i915->drm.struct_mutex); - - /* Compute a power-of-two bucket, but throw everything greater than - * 16KiB into the same bucket: i.e. the the buckets hold objects of - * (1 page, 2 pages, 4 pages, 8+ pages). - */ - n = fls(size >> PAGE_SHIFT) - 1; - if (n >= ARRAY_SIZE(pool->cache_list)) - n = ARRAY_SIZE(pool->cache_list) - 1; - list = &pool->cache_list[n]; - - list_for_each_entry(obj, list, batch_pool_link) { - struct reservation_object *resv = obj->base.resv; - - /* The batches are strictly LRU ordered */ - if (!reservation_object_test_signaled_rcu(resv, true)) - break; - - /* - * The object is now idle, clear the array of shared - * fences before we add a new request. Although, we - * remain on the same engine, we may be on a different - * timeline and so may continually grow the array, - * trapping a reference to all the old fences, rather - * than replace the existing fence. - */ - if (rcu_access_pointer(resv->fence)) { - reservation_object_lock(resv, NULL); - reservation_object_add_excl_fence(resv, NULL); - reservation_object_unlock(resv); - } - - if (obj->base.size >= size) - goto found; - } - - obj = i915_gem_object_create_internal(pool->engine->i915, size); - if (IS_ERR(obj)) - return obj; - -found: - ret = i915_gem_object_pin_pages(obj); - if (ret) - return ERR_PTR(ret); - - list_move_tail(&obj->batch_pool_link, list); - return obj; -} diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.h b/drivers/gpu/drm/i915/i915_gem_batch_pool.h deleted file mode 100644 index feeeeeaa54d8..000000000000 --- a/drivers/gpu/drm/i915/i915_gem_batch_pool.h +++ /dev/null @@ -1,26 +0,0 @@ -/* - * SPDX-License-Identifier: MIT - * - * Copyright © 2014-2018 Intel Corporation - */ - -#ifndef I915_GEM_BATCH_POOL_H -#define I915_GEM_BATCH_POOL_H - -#include - -struct drm_i915_gem_object; -struct intel_engine_cs; - -struct i915_gem_batch_pool { - struct intel_engine_cs *engine; - struct list_head cache_list[4]; -}; - -void i915_gem_batch_pool_init(struct i915_gem_batch_pool *pool, - struct intel_engine_cs *engine); -void i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool); -struct drm_i915_gem_object * -i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool, size_t size); - -#endif /* I915_GEM_BATCH_POOL_H */ From patchwork Fri Jul 26 08:46:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060597 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E277613B1 for ; Fri, 26 Jul 2019 09:04:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D4A6228A69 for ; Fri, 26 Jul 2019 09:04:11 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C936C28A73; Fri, 26 Jul 2019 09:04:11 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 61BA328A69 for ; Fri, 26 Jul 2019 09:04:11 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 905BF6ECBD; Fri, 26 Jul 2019 09:04:09 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 893BB6ECBD for ; Fri, 26 Jul 2019 09:04:04 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621098-1500050 for multiple; Fri, 26 Jul 2019 09:46:18 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:09 +0100 Message-Id: <20190726084613.22129-23-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 23/27] drm/i915/gt: Mark context->active_count as protected by timeline->mutex X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We use timeline->mutex to protect modifications to context->active_count, and the associated enable/disable callbacks. Due to complications with engine-pm barrier there is a path where we used a "superlock" to provide serialised protect and so could not unconditionally assert with lockdep that it was always held. However, we can mark the mutex as taken (noting that we may be nested underneath ourselves) which means we can be reassured the right timeline->mutex is always treated as held and let lockdep roam free. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_context.h | 3 +++ drivers/gpu/drm/i915/gt/intel_context_types.h | 2 +- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 12 ++++++++++++ drivers/gpu/drm/i915/gt/intel_timeline.c | 4 ++++ 4 files changed, 20 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h index cb63fc7b1b18..a89515aabecd 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.h +++ b/drivers/gpu/drm/i915/gt/intel_context.h @@ -89,17 +89,20 @@ void intel_context_exit_engine(struct intel_context *ce); static inline void intel_context_enter(struct intel_context *ce) { + lockdep_assert_held(&ce->ring->timeline->mutex); if (!ce->active_count++) ce->ops->enter(ce); } static inline void intel_context_mark_active(struct intel_context *ce) { + lockdep_assert_held(&ce->ring->timeline->mutex); ++ce->active_count; } static inline void intel_context_exit(struct intel_context *ce) { + lockdep_assert_held(&ce->ring->timeline->mutex); GEM_BUG_ON(!ce->active_count); if (!--ce->active_count) ce->ops->exit(ce); diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 68a7e979b1a9..92afc207ec80 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -55,7 +55,7 @@ struct intel_context { u32 *lrc_reg_state; u64 lrc_desc; - unsigned int active_count; /* notionally protected by timeline->mutex */ + unsigned int active_count; /* protected by timeline->mutex */ atomic_t pin_count; struct mutex pin_mutex; /* guards pinning and associated on-gpuing */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 4488cf5de0e8..cc971716129b 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -60,6 +60,16 @@ void intel_engine_park(struct intel_engine_cs *engine) } } +static inline void __timeline_mark_lock(struct intel_context *ce) +{ + mutex_acquire(&ce->ring->timeline->mutex.dep_map, 2, 0, _THIS_IP_); +} + +static inline void __timeline_mark_unlock(struct intel_context *ce) +{ + mutex_release(&ce->ring->timeline->mutex.dep_map, 0, _THIS_IP_); +} + static bool switch_to_kernel_context(struct intel_engine_cs *engine) { struct i915_request *rq; @@ -84,6 +94,7 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) * retiring the last request, thus all rings should be empty and * all timelines idle. */ + __timeline_mark_lock(engine->kernel_context); rq = __i915_request_create(engine->kernel_context, GFP_NOWAIT); if (IS_ERR(rq)) /* Context switch failed, hope for the best! Maybe reset? */ @@ -95,6 +106,7 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) i915_request_add_barriers(rq); __i915_request_commit(rq); + __timeline_mark_unlock(engine->kernel_context); return false; } diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 7b476cd55dac..eafd94d5e211 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -338,6 +338,8 @@ void intel_timeline_enter(struct intel_timeline *tl) { struct intel_gt_timelines *timelines = &tl->gt->timelines; + lockdep_assert_held(&tl->mutex); + GEM_BUG_ON(!atomic_read(&tl->pin_count)); if (tl->active_count++) return; @@ -352,6 +354,8 @@ void intel_timeline_exit(struct intel_timeline *tl) { struct intel_gt_timelines *timelines = &tl->gt->timelines; + lockdep_assert_held(&tl->mutex); + GEM_BUG_ON(!tl->active_count); if (--tl->active_count) return; From patchwork Fri Jul 26 08:46:10 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060595 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 55792746 for ; Fri, 26 Jul 2019 09:04:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4633028A74 for ; Fri, 26 Jul 2019 09:04:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3A9A528A5C; Fri, 26 Jul 2019 09:04:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C61CC28A6A for ; Fri, 26 Jul 2019 09:04:09 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4B6386E8C4; Fri, 26 Jul 2019 09:04:09 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 66B816E8C5 for ; Fri, 26 Jul 2019 09:04:04 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621099-1500050 for multiple; Fri, 26 Jul 2019 09:46:18 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:10 +0100 Message-Id: <20190726084613.22129-24-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 24/27] drm/i915: Forgo last_fence active request tracking X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Matthew Auld Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We were using the last_fence to track the last request that used this vma that might be interpreted by a fence register and forced ourselves to wait for this request before modifying any fence register that overlapped our vma. Due to requirement that we need to track any XY_BLT command, linear or tiled, this in effect meant that we have to track the vma for its active lifespan anyway, so we can forgo the explicit last_fence tracking and just use the whole vma->active. Another solution would be to pipeline the register updates, and would help resolve some long running stalls for gen3 (but only gen 2 and 3!) Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/i915_debugfs.c | 4 +--- drivers/gpu/drm/i915/i915_gem_fence_reg.c | 6 ++---- drivers/gpu/drm/i915/i915_gem_gtt.c | 1 - drivers/gpu/drm/i915/i915_vma.c | 13 ------------- drivers/gpu/drm/i915/i915_vma.h | 1 - 5 files changed, 3 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 3d04e04441db..afa22ce209a2 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -210,9 +210,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) } } if (vma->fence) - seq_printf(m, " , fence: %d%s", - vma->fence->id, - i915_active_request_isset(&vma->last_fence) ? "*" : ""); + seq_printf(m, " , fence: %d", vma->fence->id); seq_puts(m, ")"); spin_lock(&obj->vma.lock); diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c index bcac359ec661..c9654f1a468f 100644 --- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c +++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c @@ -230,16 +230,14 @@ static int fence_update(struct i915_fence_reg *fence, i915_gem_object_get_tiling(vma->obj))) return -EINVAL; - ret = i915_active_request_retire(&vma->last_fence, - &vma->obj->base.dev->struct_mutex); + ret = i915_active_wait(&vma->active); if (ret) return ret; } old = xchg(&fence->vma, NULL); if (old) { - ret = i915_active_request_retire(&old->last_fence, - &old->obj->base.dev->struct_mutex); + ret = i915_active_wait(&old->active); if (ret) { fence->vma = old; return ret; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 8304b98b0bf8..2847d71223de 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1868,7 +1868,6 @@ static struct i915_vma *pd_vma_create(struct gen6_ppgtt *ppgtt, int size) return ERR_PTR(-ENOMEM); i915_active_init(i915, &vma->active, NULL, NULL); - INIT_ACTIVE_REQUEST(&vma->last_fence); vma->vm = &ggtt->vm; vma->ops = &pd_vma_ops; diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index b52f71e0ade6..85a8e6fd34d5 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -119,7 +119,6 @@ vma_create(struct drm_i915_gem_object *obj, i915_active_init(vm->i915, &vma->active, __i915_vma_active, __i915_vma_retire); - INIT_ACTIVE_REQUEST(&vma->last_fence); /* Declare ourselves safe for use inside shrinkers */ if (IS_ENABLED(CONFIG_LOCKDEP)) { @@ -801,8 +800,6 @@ static void __i915_vma_destroy(struct i915_vma *vma) GEM_BUG_ON(vma->node.allocated); GEM_BUG_ON(vma->fence); - GEM_BUG_ON(i915_active_request_isset(&vma->last_fence)); - mutex_lock(&vma->vm->mutex); list_del(&vma->vm_link); mutex_unlock(&vma->vm->mutex); @@ -938,9 +935,6 @@ int i915_vma_move_to_active(struct i915_vma *vma, obj->read_domains |= I915_GEM_GPU_DOMAINS; obj->mm.dirty = true; - if (flags & EXEC_OBJECT_NEEDS_FENCE) - __i915_active_request_set(&vma->last_fence, rq); - export_fence(vma, rq, flags); GEM_BUG_ON(!i915_vma_is_active(vma)); @@ -973,14 +967,7 @@ int i915_vma_unbind(struct i915_vma *vma) * before we are finished). */ __i915_vma_pin(vma); - ret = i915_active_wait(&vma->active); - if (ret) - goto unpin; - - ret = i915_active_request_retire(&vma->last_fence, - &vma->vm->i915->drm.struct_mutex); -unpin: __i915_vma_unpin(vma); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 5c4224749bde..b3d2121be947 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -111,7 +111,6 @@ struct i915_vma { #define I915_VMA_GGTT_WRITE BIT(14) struct i915_active active; - struct i915_active_request last_fence; /** * Support different GGTT views into the same object. From patchwork Fri Jul 26 08:46:11 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060549 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 79A7113B1 for ; Fri, 26 Jul 2019 08:46:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6999C28894 for ; Fri, 26 Jul 2019 08:46:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5E46D28A5B; Fri, 26 Jul 2019 08:46:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id BF6C828894 for ; Fri, 26 Jul 2019 08:46:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4815C6E8B7; Fri, 26 Jul 2019 08:46:48 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7E20A6E8A9 for ; Fri, 26 Jul 2019 08:46:27 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621100-1500050 for multiple; Fri, 26 Jul 2019 09:46:18 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:11 +0100 Message-Id: <20190726084613.22129-25-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 25/27] drm/i915/overlay: Switch to using i915_active tracking X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Remove the raw i915_active_request tracking in favour of the higher level i915_active tracking for the sole purpose of making the lockless transition easier in later patches. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/display/intel_overlay.c | 129 +++++++++---------- drivers/gpu/drm/i915/i915_active.h | 19 --- 2 files changed, 64 insertions(+), 84 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_overlay.c b/drivers/gpu/drm/i915/display/intel_overlay.c index 07929726b780..9b3f4f6644f3 100644 --- a/drivers/gpu/drm/i915/display/intel_overlay.c +++ b/drivers/gpu/drm/i915/display/intel_overlay.c @@ -191,7 +191,8 @@ struct intel_overlay { struct overlay_registers __iomem *regs; u32 flip_addr; /* flip handling */ - struct i915_active_request last_flip; + struct i915_active last_flip; + void (*flip_complete)(struct intel_overlay *ovl); }; static void i830_overlay_clock_gating(struct drm_i915_private *dev_priv, @@ -217,30 +218,25 @@ static void i830_overlay_clock_gating(struct drm_i915_private *dev_priv, PCI_DEVFN(0, 0), I830_CLOCK_GATE, val); } -static void intel_overlay_submit_request(struct intel_overlay *overlay, - struct i915_request *rq, - i915_active_retire_fn retire) +static struct i915_request * +alloc_request(struct intel_overlay *overlay, void (*fn)(struct intel_overlay *)) { - GEM_BUG_ON(i915_active_request_peek(&overlay->last_flip, - &overlay->i915->drm.struct_mutex)); - i915_active_request_set_retire_fn(&overlay->last_flip, retire, - &overlay->i915->drm.struct_mutex); - __i915_active_request_set(&overlay->last_flip, rq); - i915_request_add(rq); -} + struct i915_request *rq; + int err; -static int intel_overlay_do_wait_request(struct intel_overlay *overlay, - struct i915_request *rq, - i915_active_retire_fn retire) -{ - intel_overlay_submit_request(overlay, rq, retire); - return i915_active_request_retire(&overlay->last_flip, - &overlay->i915->drm.struct_mutex); -} + overlay->flip_complete = fn; -static struct i915_request *alloc_request(struct intel_overlay *overlay) -{ - return i915_request_create(overlay->context); + rq = i915_request_create(overlay->context); + if (IS_ERR(rq)) + return rq; + + err = i915_active_ref(&overlay->last_flip, rq->fence.context, rq); + if (err) { + i915_request_add(rq); + return ERR_PTR(err); + } + + return rq; } /* overlay needs to be disable in OCMD reg */ @@ -252,7 +248,7 @@ static int intel_overlay_on(struct intel_overlay *overlay) WARN_ON(overlay->active); - rq = alloc_request(overlay); + rq = alloc_request(overlay, NULL); if (IS_ERR(rq)) return PTR_ERR(rq); @@ -273,7 +269,9 @@ static int intel_overlay_on(struct intel_overlay *overlay) *cs++ = MI_NOOP; intel_ring_advance(rq, cs); - return intel_overlay_do_wait_request(overlay, rq, NULL); + i915_request_add(rq); + + return i915_active_wait(&overlay->last_flip); } static void intel_overlay_flip_prepare(struct intel_overlay *overlay, @@ -317,7 +315,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay, if (tmp & (1 << 17)) DRM_DEBUG("overlay underrun, DOVSTA: %x\n", tmp); - rq = alloc_request(overlay); + rq = alloc_request(overlay, NULL); if (IS_ERR(rq)) return PTR_ERR(rq); @@ -332,8 +330,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay, intel_ring_advance(rq, cs); intel_overlay_flip_prepare(overlay, vma); - - intel_overlay_submit_request(overlay, rq, NULL); + i915_request_add(rq); return 0; } @@ -354,20 +351,13 @@ static void intel_overlay_release_old_vma(struct intel_overlay *overlay) } static void -intel_overlay_release_old_vid_tail(struct i915_active_request *active, - struct i915_request *rq) +intel_overlay_release_old_vid_tail(struct intel_overlay *overlay) { - struct intel_overlay *overlay = - container_of(active, typeof(*overlay), last_flip); - intel_overlay_release_old_vma(overlay); } -static void intel_overlay_off_tail(struct i915_active_request *active, - struct i915_request *rq) +static void intel_overlay_off_tail(struct intel_overlay *overlay) { - struct intel_overlay *overlay = - container_of(active, typeof(*overlay), last_flip); struct drm_i915_private *dev_priv = overlay->i915; intel_overlay_release_old_vma(overlay); @@ -380,6 +370,16 @@ static void intel_overlay_off_tail(struct i915_active_request *active, i830_overlay_clock_gating(dev_priv, true); } +static void +intel_overlay_last_flip_retire(struct i915_active *active) +{ + struct intel_overlay *overlay = + container_of(active, typeof(*overlay), last_flip); + + if (overlay->flip_complete) + overlay->flip_complete(overlay); +} + /* overlay needs to be disabled in OCMD reg */ static int intel_overlay_off(struct intel_overlay *overlay) { @@ -394,7 +394,7 @@ static int intel_overlay_off(struct intel_overlay *overlay) * of the hw. Do it in both cases */ flip_addr |= OFC_UPDATE; - rq = alloc_request(overlay); + rq = alloc_request(overlay, intel_overlay_off_tail); if (IS_ERR(rq)) return PTR_ERR(rq); @@ -417,17 +417,16 @@ static int intel_overlay_off(struct intel_overlay *overlay) intel_ring_advance(rq, cs); intel_overlay_flip_prepare(overlay, NULL); + i915_request_add(rq); - return intel_overlay_do_wait_request(overlay, rq, - intel_overlay_off_tail); + return i915_active_wait(&overlay->last_flip); } /* recover from an interruption due to a signal * We have to be careful not to repeat work forever an make forward progess. */ static int intel_overlay_recover_from_interrupt(struct intel_overlay *overlay) { - return i915_active_request_retire(&overlay->last_flip, - &overlay->i915->drm.struct_mutex); + return i915_active_wait(&overlay->last_flip); } /* Wait for pending overlay flip and release old frame. @@ -437,43 +436,40 @@ static int intel_overlay_recover_from_interrupt(struct intel_overlay *overlay) static int intel_overlay_release_old_vid(struct intel_overlay *overlay) { struct drm_i915_private *dev_priv = overlay->i915; + struct i915_request *rq; u32 *cs; - int ret; lockdep_assert_held(&dev_priv->drm.struct_mutex); - /* Only wait if there is actually an old frame to release to + /* + * Only wait if there is actually an old frame to release to * guarantee forward progress. */ if (!overlay->old_vma) return 0; - if (I915_READ(GEN2_ISR) & I915_OVERLAY_PLANE_FLIP_PENDING_INTERRUPT) { - /* synchronous slowpath */ - struct i915_request *rq; + if (!(I915_READ(GEN2_ISR) & I915_OVERLAY_PLANE_FLIP_PENDING_INTERRUPT)) { + intel_overlay_release_old_vid_tail(overlay); + return 0; + } - rq = alloc_request(overlay); - if (IS_ERR(rq)) - return PTR_ERR(rq); + rq = alloc_request(overlay, intel_overlay_release_old_vid_tail); + if (IS_ERR(rq)) + return PTR_ERR(rq); - cs = intel_ring_begin(rq, 2); - if (IS_ERR(cs)) { - i915_request_add(rq); - return PTR_ERR(cs); - } + cs = intel_ring_begin(rq, 2); + if (IS_ERR(cs)) { + i915_request_add(rq); + return PTR_ERR(cs); + } - *cs++ = MI_WAIT_FOR_EVENT | MI_WAIT_FOR_OVERLAY_FLIP; - *cs++ = MI_NOOP; - intel_ring_advance(rq, cs); + *cs++ = MI_WAIT_FOR_EVENT | MI_WAIT_FOR_OVERLAY_FLIP; + *cs++ = MI_NOOP; + intel_ring_advance(rq, cs); - ret = intel_overlay_do_wait_request(overlay, rq, - intel_overlay_release_old_vid_tail); - if (ret) - return ret; - } else - intel_overlay_release_old_vid_tail(&overlay->last_flip, NULL); + i915_request_add(rq); - return 0; + return i915_active_wait(&overlay->last_flip); } void intel_overlay_reset(struct drm_i915_private *dev_priv) @@ -1375,7 +1371,9 @@ void intel_overlay_setup(struct drm_i915_private *dev_priv) overlay->contrast = 75; overlay->saturation = 146; - INIT_ACTIVE_REQUEST(&overlay->last_flip); + i915_active_init(dev_priv, + &overlay->last_flip, + NULL, intel_overlay_last_flip_retire); ret = get_registers(overlay, OVERLAY_NEEDS_PHYSICAL(dev_priv)); if (ret) @@ -1409,6 +1407,7 @@ void intel_overlay_cleanup(struct drm_i915_private *dev_priv) WARN_ON(overlay->active); i915_gem_object_put(overlay->reg_bo); + i915_active_fini(&overlay->last_flip); kfree(overlay); } diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index ba68b077ec6c..ad28b866eb87 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -89,25 +89,6 @@ int __must_check i915_active_request_set(struct i915_active_request *active, struct i915_request *rq); -/** - * i915_active_request_set_retire_fn - updates the retirement callback - * @active - the active tracker - * @fn - the routine called when the request is retired - * @mutex - struct_mutex used to guard retirements - * - * i915_active_request_set_retire_fn() updates the function pointer that - * is called when the final request associated with the @active tracker - * is retired. - */ -static inline void -i915_active_request_set_retire_fn(struct i915_active_request *active, - i915_active_retire_fn fn, - struct mutex *mutex) -{ - lockdep_assert_held(mutex); - active->retire = fn ?: i915_active_retire_noop; -} - /** * i915_active_request_raw - return the active request * @active - the active tracker From patchwork Fri Jul 26 08:46:12 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060535 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4F75B13A0 for ; Fri, 26 Jul 2019 08:46:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3C878205FC for ; Fri, 26 Jul 2019 08:46:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 308EE28A3D; Fri, 26 Jul 2019 08:46:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 2952E205FC for ; Fri, 26 Jul 2019 08:46:33 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 63DF36E8A9; Fri, 26 Jul 2019 08:46:32 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id ABE936E8A9 for ; Fri, 26 Jul 2019 08:46:26 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621101-1500050 for multiple; Fri, 26 Jul 2019 09:46:18 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:12 +0100 Message-Id: <20190726084613.22129-26-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 26/27] drm/i915: Extract intel_frontbuffer active tracking X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Move the active tracking for the frontbuffer operations out of the i915_gem_object and into its own first class (refcounted) object. In the process of detangling, we switch from low level request tracking to the easier i915_active -- with the plan that this avoids any potential atomic callbacks as the frontbuffer tracking wishes to sleep as it flushes. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/display/intel_display.c | 70 +++-- drivers/gpu/drm/i915/display/intel_fbdev.c | 40 ++- .../gpu/drm/i915/display/intel_frontbuffer.c | 255 +++++++++++++----- .../gpu/drm/i915/display/intel_frontbuffer.h | 70 +++-- drivers/gpu/drm/i915/display/intel_overlay.c | 8 +- drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_domain.c | 14 +- drivers/gpu/drm/i915/gem/i915_gem_mman.c | 4 - drivers/gpu/drm/i915/gem/i915_gem_object.c | 27 +- drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 +- .../gpu/drm/i915/gem/i915_gem_object_types.h | 8 +- drivers/gpu/drm/i915/i915_debugfs.c | 5 - drivers/gpu/drm/i915/i915_drv.h | 4 - drivers/gpu/drm/i915/i915_gem.c | 47 +--- drivers/gpu/drm/i915/i915_vma.c | 6 +- drivers/gpu/drm/i915/intel_drv.h | 1 + 16 files changed, 306 insertions(+), 257 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index e25b82d07d4f..219ae5a60b31 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -3049,12 +3049,13 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc, { struct drm_device *dev = crtc->base.dev; struct drm_i915_private *dev_priv = to_i915(dev); - struct drm_i915_gem_object *obj = NULL; struct drm_mode_fb_cmd2 mode_cmd = { 0 }; struct drm_framebuffer *fb = &plane_config->fb->base; u32 base_aligned = round_down(plane_config->base, PAGE_SIZE); u32 size_aligned = round_up(plane_config->base + plane_config->size, PAGE_SIZE); + struct drm_i915_gem_object *obj; + bool ret = false; size_aligned -= base_aligned; @@ -3096,7 +3097,7 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc, break; default: MISSING_CASE(plane_config->tiling); - return false; + goto out; } mode_cmd.pixel_format = fb->format->format; @@ -3108,16 +3109,15 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc, if (intel_framebuffer_init(to_intel_framebuffer(fb), obj, &mode_cmd)) { DRM_DEBUG_KMS("intel fb init failed\n"); - goto out_unref_obj; + goto out; } DRM_DEBUG_KMS("initial plane fb obj %p\n", obj); - return true; - -out_unref_obj: + ret = true; +out: i915_gem_object_put(obj); - return false; + return ret; } static void @@ -3174,6 +3174,12 @@ static void intel_plane_disable_noatomic(struct intel_crtc *crtc, intel_disable_plane(plane, crtc_state); } +static struct intel_frontbuffer * +to_intel_frontbuffer(struct drm_framebuffer *fb) +{ + return fb ? to_intel_framebuffer(fb)->frontbuffer : NULL; +} + static void intel_find_initial_plane_obj(struct intel_crtc *intel_crtc, struct intel_initial_plane_config *plane_config) @@ -3181,7 +3187,6 @@ intel_find_initial_plane_obj(struct intel_crtc *intel_crtc, struct drm_device *dev = intel_crtc->base.dev; struct drm_i915_private *dev_priv = to_i915(dev); struct drm_crtc *c; - struct drm_i915_gem_object *obj; struct drm_plane *primary = intel_crtc->base.primary; struct drm_plane_state *plane_state = primary->state; struct intel_plane *intel_plane = to_intel_plane(primary); @@ -3257,8 +3262,7 @@ intel_find_initial_plane_obj(struct intel_crtc *intel_crtc, return; } - obj = intel_fb_obj(fb); - intel_fb_obj_flush(obj, ORIGIN_DIRTYFB); + intel_frontbuffer_flush(to_intel_frontbuffer(fb), ORIGIN_DIRTYFB); plane_state->src_x = 0; plane_state->src_y = 0; @@ -3273,14 +3277,14 @@ intel_find_initial_plane_obj(struct intel_crtc *intel_crtc, intel_state->base.src = drm_plane_state_src(plane_state); intel_state->base.dst = drm_plane_state_dest(plane_state); - if (i915_gem_object_is_tiled(obj)) + if (plane_config->tiling) dev_priv->preserve_bios_swizzle = true; plane_state->fb = fb; plane_state->crtc = &intel_crtc->base; atomic_or(to_intel_plane(primary)->frontbuffer_bit, - &obj->frontbuffer_bits); + &to_intel_frontbuffer(fb)->bits); } static int skl_max_plane_width(const struct drm_framebuffer *fb, @@ -14129,9 +14133,9 @@ static void intel_atomic_track_fbs(struct intel_atomic_state *state) for_each_oldnew_intel_plane_in_state(state, plane, old_plane_state, new_plane_state, i) - i915_gem_track_fb(intel_fb_obj(old_plane_state->base.fb), - intel_fb_obj(new_plane_state->base.fb), - plane->frontbuffer_bit); + intel_frontbuffer_track(to_intel_frontbuffer(old_plane_state->base.fb), + to_intel_frontbuffer(new_plane_state->base.fb), + plane->frontbuffer_bit); } static int intel_atomic_commit(struct drm_device *dev, @@ -14415,7 +14419,7 @@ intel_prepare_plane_fb(struct drm_plane *plane, return ret; fb_obj_bump_render_priority(obj); - intel_fb_obj_flush(obj, ORIGIN_DIRTYFB); + intel_frontbuffer_flush(obj->frontbuffer, ORIGIN_DIRTYFB); if (!new_state->fence) { /* implicit fencing */ struct dma_fence *fence; @@ -14678,13 +14682,12 @@ intel_legacy_cursor_update(struct drm_plane *plane, struct drm_modeset_acquire_ctx *ctx) { struct drm_i915_private *dev_priv = to_i915(crtc->dev); - int ret; struct drm_plane_state *old_plane_state, *new_plane_state; struct intel_plane *intel_plane = to_intel_plane(plane); - struct drm_framebuffer *old_fb; struct intel_crtc_state *crtc_state = to_intel_crtc_state(crtc->state); struct intel_crtc_state *new_crtc_state; + int ret; /* * When crtc is inactive or there is a modeset pending, @@ -14752,11 +14755,10 @@ intel_legacy_cursor_update(struct drm_plane *plane, if (ret) goto out_unlock; - intel_fb_obj_flush(intel_fb_obj(fb), ORIGIN_FLIP); - - old_fb = old_plane_state->fb; - i915_gem_track_fb(intel_fb_obj(old_fb), intel_fb_obj(fb), - intel_plane->frontbuffer_bit); + intel_frontbuffer_flush(to_intel_frontbuffer(fb), ORIGIN_FLIP); + intel_frontbuffer_track(to_intel_frontbuffer(old_plane_state->fb), + to_intel_frontbuffer(fb), + intel_plane->frontbuffer_bit); /* Swap plane state */ plane->state = new_plane_state; @@ -15536,15 +15538,9 @@ static void intel_setup_outputs(struct drm_i915_private *dev_priv) static void intel_user_framebuffer_destroy(struct drm_framebuffer *fb) { struct intel_framebuffer *intel_fb = to_intel_framebuffer(fb); - struct drm_i915_gem_object *obj = intel_fb_obj(fb); drm_framebuffer_cleanup(fb); - - i915_gem_object_lock(obj); - WARN_ON(!obj->framebuffer_references--); - i915_gem_object_unlock(obj); - - i915_gem_object_put(obj); + intel_frontbuffer_put(intel_fb->frontbuffer); kfree(intel_fb); } @@ -15572,7 +15568,7 @@ static int intel_user_framebuffer_dirty(struct drm_framebuffer *fb, struct drm_i915_gem_object *obj = intel_fb_obj(fb); i915_gem_object_flush_if_display(obj); - intel_fb_obj_flush(obj, ORIGIN_DIRTYFB); + intel_frontbuffer_flush(to_intel_frontbuffer(fb), ORIGIN_DIRTYFB); return 0; } @@ -15594,8 +15590,11 @@ static int intel_framebuffer_init(struct intel_framebuffer *intel_fb, int ret = -EINVAL; int i; + intel_fb->frontbuffer = intel_frontbuffer_get(obj); + if (!intel_fb->frontbuffer) + return -ENOMEM; + i915_gem_object_lock(obj); - obj->framebuffer_references++; tiling = i915_gem_object_get_tiling(obj); stride = i915_gem_object_get_stride(obj); i915_gem_object_unlock(obj); @@ -15712,9 +15711,7 @@ static int intel_framebuffer_init(struct intel_framebuffer *intel_fb, return 0; err: - i915_gem_object_lock(obj); - obj->framebuffer_references--; - i915_gem_object_unlock(obj); + intel_frontbuffer_put(intel_fb->frontbuffer); return ret; } @@ -15732,8 +15729,7 @@ intel_user_framebuffer_create(struct drm_device *dev, return ERR_PTR(-ENOENT); fb = intel_framebuffer_create(obj, &mode_cmd); - if (IS_ERR(fb)) - i915_gem_object_put(obj); + i915_gem_object_put(obj); return fb; } diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c index 1edd44ee32b2..4b57cdd76699 100644 --- a/drivers/gpu/drm/i915/display/intel_fbdev.c +++ b/drivers/gpu/drm/i915/display/intel_fbdev.c @@ -47,13 +47,14 @@ #include "intel_fbdev.h" #include "intel_frontbuffer.h" -static void intel_fbdev_invalidate(struct intel_fbdev *ifbdev) +static struct intel_frontbuffer *to_frontbuffer(struct intel_fbdev *ifbdev) { - struct drm_i915_gem_object *obj = intel_fb_obj(&ifbdev->fb->base); - unsigned int origin = - ifbdev->vma_flags & PLANE_HAS_FENCE ? ORIGIN_GTT : ORIGIN_CPU; + return ifbdev->fb->frontbuffer; +} - intel_fb_obj_invalidate(obj, origin); +static void intel_fbdev_invalidate(struct intel_fbdev *ifbdev) +{ + intel_frontbuffer_invalidate(to_frontbuffer(ifbdev), ORIGIN_CPU); } static int intel_fbdev_set_par(struct fb_info *info) @@ -120,7 +121,7 @@ static int intelfb_alloc(struct drm_fb_helper *helper, struct drm_i915_private *dev_priv = to_i915(dev); struct drm_mode_fb_cmd2 mode_cmd = {}; struct drm_i915_gem_object *obj; - int size, ret; + int size; /* we don't do packed 24bpp */ if (sizes->surface_bpp == 24) @@ -147,24 +148,16 @@ static int intelfb_alloc(struct drm_fb_helper *helper, obj = i915_gem_object_create_shmem(dev_priv, size); if (IS_ERR(obj)) { DRM_ERROR("failed to allocate framebuffer\n"); - ret = PTR_ERR(obj); - goto err; + return PTR_ERR(obj); } fb = intel_framebuffer_create(obj, &mode_cmd); - if (IS_ERR(fb)) { - ret = PTR_ERR(fb); - goto err_obj; - } + i915_gem_object_put(obj); + if (IS_ERR(fb)) + return PTR_ERR(fb); ifbdev->fb = to_intel_framebuffer(fb); - return 0; - -err_obj: - i915_gem_object_put(obj); -err: - return ret; } static int intelfb_create(struct drm_fb_helper *helper, @@ -180,7 +173,6 @@ static int intelfb_create(struct drm_fb_helper *helper, const struct i915_ggtt_view view = { .type = I915_GGTT_VIEW_NORMAL, }; - struct drm_framebuffer *fb; intel_wakeref_t wakeref; struct fb_info *info; struct i915_vma *vma; @@ -226,8 +218,7 @@ static int intelfb_create(struct drm_fb_helper *helper, goto out_unlock; } - fb = &ifbdev->fb->base; - intel_fb_obj_flush(intel_fb_obj(fb), ORIGIN_DIRTYFB); + intel_frontbuffer_flush(to_frontbuffer(ifbdev), ORIGIN_DIRTYFB); info = drm_fb_helper_alloc_fbi(helper); if (IS_ERR(info)) { @@ -236,7 +227,7 @@ static int intelfb_create(struct drm_fb_helper *helper, goto out_unpin; } - ifbdev->helper.fb = fb; + ifbdev->helper.fb = &ifbdev->fb->base; info->fbops = &intelfb_ops; @@ -262,13 +253,14 @@ static int intelfb_create(struct drm_fb_helper *helper, * If the object is stolen however, it will be full of whatever * garbage was left in there. */ - if (intel_fb_obj(fb)->stolen && !prealloc) + if (vma->obj->stolen && !prealloc) memset_io(info->screen_base, 0, info->screen_size); /* Use default scratch pixmap (info->pixmap.flags = FB_PIXMAP_SYSTEM) */ DRM_DEBUG_KMS("allocated %dx%d fb: 0x%08x\n", - fb->width, fb->height, i915_ggtt_offset(vma)); + ifbdev->fb->base.width, ifbdev->fb->base.height, + i915_ggtt_offset(vma)); ifbdev->vma = vma; ifbdev->vma_flags = flags; diff --git a/drivers/gpu/drm/i915/display/intel_frontbuffer.c b/drivers/gpu/drm/i915/display/intel_frontbuffer.c index 44273c10cea5..d11031811a48 100644 --- a/drivers/gpu/drm/i915/display/intel_frontbuffer.c +++ b/drivers/gpu/drm/i915/display/intel_frontbuffer.c @@ -30,11 +30,11 @@ * Many features require us to track changes to the currently active * frontbuffer, especially rendering targeted at the frontbuffer. * - * To be able to do so GEM tracks frontbuffers using a bitmask for all possible - * frontbuffer slots through i915_gem_track_fb(). The function in this file are - * then called when the contents of the frontbuffer are invalidated, when - * frontbuffer rendering has stopped again to flush out all the changes and when - * the frontbuffer is exchanged with a flip. Subsystems interested in + * To be able to do so we track frontbuffers using a bitmask for all possible + * frontbuffer slots through intel_frontbuffer_track(). The functions in this + * file are then called when the contents of the frontbuffer are invalidated, + * when frontbuffer rendering has stopped again to flush out all the changes + * and when the frontbuffer is exchanged with a flip. Subsystems interested in * frontbuffer changes (e.g. PSR, FBC, DRRS) should directly put their callbacks * into the relevant places and filter for the frontbuffer slots that they are * interested int. @@ -63,28 +63,9 @@ #include "intel_frontbuffer.h" #include "intel_psr.h" -void __intel_fb_obj_invalidate(struct drm_i915_gem_object *obj, - enum fb_op_origin origin, - unsigned int frontbuffer_bits) -{ - struct drm_i915_private *dev_priv = to_i915(obj->base.dev); - - if (origin == ORIGIN_CS) { - spin_lock(&dev_priv->fb_tracking.lock); - dev_priv->fb_tracking.busy_bits |= frontbuffer_bits; - dev_priv->fb_tracking.flip_bits &= ~frontbuffer_bits; - spin_unlock(&dev_priv->fb_tracking.lock); - } - - might_sleep(); - intel_psr_invalidate(dev_priv, frontbuffer_bits, origin); - intel_edp_drrs_invalidate(dev_priv, frontbuffer_bits); - intel_fbc_invalidate(dev_priv, frontbuffer_bits, origin); -} - /** - * intel_frontbuffer_flush - flush frontbuffer - * @dev_priv: i915 device + * frontbuffer_flush - flush frontbuffer + * @i915: i915 device * @frontbuffer_bits: frontbuffer plane tracking bits * @origin: which operation caused the flush * @@ -94,45 +75,27 @@ void __intel_fb_obj_invalidate(struct drm_i915_gem_object *obj, * * Can be called without any locks held. */ -static void intel_frontbuffer_flush(struct drm_i915_private *dev_priv, - unsigned frontbuffer_bits, - enum fb_op_origin origin) +static void frontbuffer_flush(struct drm_i915_private *i915, + unsigned int frontbuffer_bits, + enum fb_op_origin origin) { /* Delay flushing when rings are still busy.*/ - spin_lock(&dev_priv->fb_tracking.lock); - frontbuffer_bits &= ~dev_priv->fb_tracking.busy_bits; - spin_unlock(&dev_priv->fb_tracking.lock); + spin_lock(&i915->fb_tracking.lock); + frontbuffer_bits &= ~i915->fb_tracking.busy_bits; + spin_unlock(&i915->fb_tracking.lock); if (!frontbuffer_bits) return; might_sleep(); - intel_edp_drrs_flush(dev_priv, frontbuffer_bits); - intel_psr_flush(dev_priv, frontbuffer_bits, origin); - intel_fbc_flush(dev_priv, frontbuffer_bits, origin); -} - -void __intel_fb_obj_flush(struct drm_i915_gem_object *obj, - enum fb_op_origin origin, - unsigned int frontbuffer_bits) -{ - struct drm_i915_private *dev_priv = to_i915(obj->base.dev); - - if (origin == ORIGIN_CS) { - spin_lock(&dev_priv->fb_tracking.lock); - /* Filter out new bits since rendering started. */ - frontbuffer_bits &= dev_priv->fb_tracking.busy_bits; - dev_priv->fb_tracking.busy_bits &= ~frontbuffer_bits; - spin_unlock(&dev_priv->fb_tracking.lock); - } - - if (frontbuffer_bits) - intel_frontbuffer_flush(dev_priv, frontbuffer_bits, origin); + intel_edp_drrs_flush(i915, frontbuffer_bits); + intel_psr_flush(i915, frontbuffer_bits, origin); + intel_fbc_flush(i915, frontbuffer_bits, origin); } /** * intel_frontbuffer_flip_prepare - prepare asynchronous frontbuffer flip - * @dev_priv: i915 device + * @i915: i915 device * @frontbuffer_bits: frontbuffer plane tracking bits * * This function gets called after scheduling a flip on @obj. The actual @@ -142,19 +105,19 @@ void __intel_fb_obj_flush(struct drm_i915_gem_object *obj, * * Can be called without any locks held. */ -void intel_frontbuffer_flip_prepare(struct drm_i915_private *dev_priv, +void intel_frontbuffer_flip_prepare(struct drm_i915_private *i915, unsigned frontbuffer_bits) { - spin_lock(&dev_priv->fb_tracking.lock); - dev_priv->fb_tracking.flip_bits |= frontbuffer_bits; + spin_lock(&i915->fb_tracking.lock); + i915->fb_tracking.flip_bits |= frontbuffer_bits; /* Remove stale busy bits due to the old buffer. */ - dev_priv->fb_tracking.busy_bits &= ~frontbuffer_bits; - spin_unlock(&dev_priv->fb_tracking.lock); + i915->fb_tracking.busy_bits &= ~frontbuffer_bits; + spin_unlock(&i915->fb_tracking.lock); } /** * intel_frontbuffer_flip_complete - complete asynchronous frontbuffer flip - * @dev_priv: i915 device + * @i915: i915 device * @frontbuffer_bits: frontbuffer plane tracking bits * * This function gets called after the flip has been latched and will complete @@ -162,23 +125,22 @@ void intel_frontbuffer_flip_prepare(struct drm_i915_private *dev_priv, * * Can be called without any locks held. */ -void intel_frontbuffer_flip_complete(struct drm_i915_private *dev_priv, +void intel_frontbuffer_flip_complete(struct drm_i915_private *i915, unsigned frontbuffer_bits) { - spin_lock(&dev_priv->fb_tracking.lock); + spin_lock(&i915->fb_tracking.lock); /* Mask any cancelled flips. */ - frontbuffer_bits &= dev_priv->fb_tracking.flip_bits; - dev_priv->fb_tracking.flip_bits &= ~frontbuffer_bits; - spin_unlock(&dev_priv->fb_tracking.lock); + frontbuffer_bits &= i915->fb_tracking.flip_bits; + i915->fb_tracking.flip_bits &= ~frontbuffer_bits; + spin_unlock(&i915->fb_tracking.lock); if (frontbuffer_bits) - intel_frontbuffer_flush(dev_priv, - frontbuffer_bits, ORIGIN_FLIP); + frontbuffer_flush(i915, frontbuffer_bits, ORIGIN_FLIP); } /** * intel_frontbuffer_flip - synchronous frontbuffer flip - * @dev_priv: i915 device + * @i915: i915 device * @frontbuffer_bits: frontbuffer plane tracking bits * * This function gets called after scheduling a flip on @obj. This is for @@ -187,13 +149,160 @@ void intel_frontbuffer_flip_complete(struct drm_i915_private *dev_priv, * * Can be called without any locks held. */ -void intel_frontbuffer_flip(struct drm_i915_private *dev_priv, +void intel_frontbuffer_flip(struct drm_i915_private *i915, unsigned frontbuffer_bits) { - spin_lock(&dev_priv->fb_tracking.lock); + spin_lock(&i915->fb_tracking.lock); /* Remove stale busy bits due to the old buffer. */ - dev_priv->fb_tracking.busy_bits &= ~frontbuffer_bits; - spin_unlock(&dev_priv->fb_tracking.lock); + i915->fb_tracking.busy_bits &= ~frontbuffer_bits; + spin_unlock(&i915->fb_tracking.lock); - intel_frontbuffer_flush(dev_priv, frontbuffer_bits, ORIGIN_FLIP); + frontbuffer_flush(i915, frontbuffer_bits, ORIGIN_FLIP); +} + +void __intel_fb_invalidate(struct intel_frontbuffer *front, + enum fb_op_origin origin, + unsigned int frontbuffer_bits) +{ + struct drm_i915_private *i915 = to_i915(front->obj->base.dev); + + if (origin == ORIGIN_CS) { + spin_lock(&i915->fb_tracking.lock); + i915->fb_tracking.busy_bits |= frontbuffer_bits; + i915->fb_tracking.flip_bits &= ~frontbuffer_bits; + spin_unlock(&i915->fb_tracking.lock); + } + + might_sleep(); + intel_psr_invalidate(i915, frontbuffer_bits, origin); + intel_edp_drrs_invalidate(i915, frontbuffer_bits); + intel_fbc_invalidate(i915, frontbuffer_bits, origin); +} + +void __intel_fb_flush(struct intel_frontbuffer *front, + enum fb_op_origin origin, + unsigned int frontbuffer_bits) +{ + struct drm_i915_private *i915 = to_i915(front->obj->base.dev); + + if (origin == ORIGIN_CS) { + spin_lock(&i915->fb_tracking.lock); + /* Filter out new bits since rendering started. */ + frontbuffer_bits &= i915->fb_tracking.busy_bits; + i915->fb_tracking.busy_bits &= ~frontbuffer_bits; + spin_unlock(&i915->fb_tracking.lock); + } + + if (frontbuffer_bits) + frontbuffer_flush(i915, frontbuffer_bits, origin); +} + +static int frontbuffer_active(struct i915_active *ref) +{ + struct intel_frontbuffer *front = + container_of(ref, typeof(*front), write); + + kref_get(&front->ref); + return 0; +} + +static void frontbuffer_retire(struct i915_active *ref) +{ + struct intel_frontbuffer *front = + container_of(ref, typeof(*front), write); + + intel_frontbuffer_flush(front, ORIGIN_CS); + intel_frontbuffer_put(front); +} + +static void frontbuffer_release(struct kref *ref) + __releases(&to_i915(front->obj->base.dev)->fb_tracking.lock) +{ + struct intel_frontbuffer *front = + container_of(ref, typeof(*front), ref); + + front->obj->frontbuffer = NULL; + spin_unlock(&to_i915(front->obj->base.dev)->fb_tracking.lock); + + i915_gem_object_put(front->obj); + kfree(front); +} + +struct intel_frontbuffer * +intel_frontbuffer_get(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + struct intel_frontbuffer *front; + + spin_lock(&i915->fb_tracking.lock); + front = obj->frontbuffer; + if (front) + kref_get(&front->ref); + spin_unlock(&i915->fb_tracking.lock); + if (front) + return front; + + front = kmalloc(sizeof(*front), GFP_KERNEL); + if (!front) + return NULL; + + front->obj = obj; + kref_init(&front->ref); + atomic_set(&front->bits, 0); + i915_active_init(i915, &front->write, + frontbuffer_active, frontbuffer_retire); + + spin_lock(&i915->fb_tracking.lock); + if (obj->frontbuffer) { + kfree(front); + front = obj->frontbuffer; + kref_get(&front->ref); + } else { + i915_gem_object_get(obj); + obj->frontbuffer = front; + } + spin_unlock(&i915->fb_tracking.lock); + + return front; +} + +void intel_frontbuffer_put(struct intel_frontbuffer *front) +{ + kref_put_lock(&front->ref, + frontbuffer_release, + &to_i915(front->obj->base.dev)->fb_tracking.lock); +} + +/** + * intel_frontbuffer_track - update frontbuffer tracking + * @old: current buffer for the frontbuffer slots + * @new: new buffer for the frontbuffer slots + * @frontbuffer_bits: bitmask of frontbuffer slots + * + * This updates the frontbuffer tracking bits @frontbuffer_bits by clearing them + * from @old and setting them in @new. Both @old and @new can be NULL. + */ +void intel_frontbuffer_track(struct intel_frontbuffer *old, + struct intel_frontbuffer *new, + unsigned int frontbuffer_bits) +{ + /* + * Control of individual bits within the mask are guarded by + * the owning plane->mutex, i.e. we can never see concurrent + * manipulation of individual bits. But since the bitfield as a whole + * is updated using RMW, we need to use atomics in order to update + * the bits. + */ + BUILD_BUG_ON(INTEL_FRONTBUFFER_BITS_PER_PIPE * I915_MAX_PIPES > + BITS_PER_TYPE(atomic_t)); + + if (old) { + WARN_ON(!(atomic_read(&old->bits) & frontbuffer_bits)); + atomic_andnot(frontbuffer_bits, &old->bits); + } + + if (new) { + WARN_ON(atomic_read(&new->bits) & frontbuffer_bits); + atomic_or(frontbuffer_bits, &new->bits); + } } diff --git a/drivers/gpu/drm/i915/display/intel_frontbuffer.h b/drivers/gpu/drm/i915/display/intel_frontbuffer.h index 5727320c8084..adc64d61a4a5 100644 --- a/drivers/gpu/drm/i915/display/intel_frontbuffer.h +++ b/drivers/gpu/drm/i915/display/intel_frontbuffer.h @@ -24,7 +24,10 @@ #ifndef __INTEL_FRONTBUFFER_H__ #define __INTEL_FRONTBUFFER_H__ -#include "gem/i915_gem_object.h" +#include +#include + +#include "i915_active.h" struct drm_i915_private; struct drm_i915_gem_object; @@ -37,23 +40,30 @@ enum fb_op_origin { ORIGIN_DIRTYFB, }; -void intel_frontbuffer_flip_prepare(struct drm_i915_private *dev_priv, +struct intel_frontbuffer { + struct kref ref; + atomic_t bits; + struct i915_active write; + struct drm_i915_gem_object *obj; +}; + +void intel_frontbuffer_flip_prepare(struct drm_i915_private *i915, unsigned frontbuffer_bits); -void intel_frontbuffer_flip_complete(struct drm_i915_private *dev_priv, +void intel_frontbuffer_flip_complete(struct drm_i915_private *i915, unsigned frontbuffer_bits); -void intel_frontbuffer_flip(struct drm_i915_private *dev_priv, +void intel_frontbuffer_flip(struct drm_i915_private *i915, unsigned frontbuffer_bits); -void __intel_fb_obj_invalidate(struct drm_i915_gem_object *obj, - enum fb_op_origin origin, - unsigned int frontbuffer_bits); -void __intel_fb_obj_flush(struct drm_i915_gem_object *obj, - enum fb_op_origin origin, - unsigned int frontbuffer_bits); +struct intel_frontbuffer * +intel_frontbuffer_get(struct drm_i915_gem_object *obj); + +void __intel_fb_invalidate(struct intel_frontbuffer *front, + enum fb_op_origin origin, + unsigned int frontbuffer_bits); /** - * intel_fb_obj_invalidate - invalidate frontbuffer object - * @obj: GEM object to invalidate + * intel_frontbuffer_invalidate - invalidate frontbuffer object + * @front: GEM object to invalidate * @origin: which operation caused the invalidation * * This function gets called every time rendering on the given object starts and @@ -62,37 +72,53 @@ void __intel_fb_obj_flush(struct drm_i915_gem_object *obj, * until the rendering completes or a flip on this frontbuffer plane is * scheduled. */ -static inline bool intel_fb_obj_invalidate(struct drm_i915_gem_object *obj, - enum fb_op_origin origin) +static inline bool intel_frontbuffer_invalidate(struct intel_frontbuffer *front, + enum fb_op_origin origin) { unsigned int frontbuffer_bits; - frontbuffer_bits = atomic_read(&obj->frontbuffer_bits); + if (!front) + return false; + + frontbuffer_bits = atomic_read(&front->bits); if (!frontbuffer_bits) return false; - __intel_fb_obj_invalidate(obj, origin, frontbuffer_bits); + __intel_fb_invalidate(front, origin, frontbuffer_bits); return true; } +void __intel_fb_flush(struct intel_frontbuffer *front, + enum fb_op_origin origin, + unsigned int frontbuffer_bits); + /** - * intel_fb_obj_flush - flush frontbuffer object - * @obj: GEM object to flush + * intel_frontbuffer_flush - flush frontbuffer object + * @front: GEM object to flush * @origin: which operation caused the flush * * This function gets called every time rendering on the given object has * completed and frontbuffer caching can be started again. */ -static inline void intel_fb_obj_flush(struct drm_i915_gem_object *obj, - enum fb_op_origin origin) +static inline void intel_frontbuffer_flush(struct intel_frontbuffer *front, + enum fb_op_origin origin) { unsigned int frontbuffer_bits; - frontbuffer_bits = atomic_read(&obj->frontbuffer_bits); + if (!front) + return; + + frontbuffer_bits = atomic_read(&front->bits); if (!frontbuffer_bits) return; - __intel_fb_obj_flush(obj, origin, frontbuffer_bits); + __intel_fb_flush(front, origin, frontbuffer_bits); } +void intel_frontbuffer_track(struct intel_frontbuffer *old, + struct intel_frontbuffer *new, + unsigned int frontbuffer_bits); + +void intel_frontbuffer_put(struct intel_frontbuffer *front); + #endif /* __INTEL_FRONTBUFFER_H__ */ diff --git a/drivers/gpu/drm/i915/display/intel_overlay.c b/drivers/gpu/drm/i915/display/intel_overlay.c index 9b3f4f6644f3..1a15fa34205c 100644 --- a/drivers/gpu/drm/i915/display/intel_overlay.c +++ b/drivers/gpu/drm/i915/display/intel_overlay.c @@ -281,9 +281,9 @@ static void intel_overlay_flip_prepare(struct intel_overlay *overlay, WARN_ON(overlay->old_vma); - i915_gem_track_fb(overlay->vma ? overlay->vma->obj : NULL, - vma ? vma->obj : NULL, - INTEL_FRONTBUFFER_OVERLAY(pipe)); + intel_frontbuffer_track(overlay->vma ? overlay->vma->obj->frontbuffer : NULL, + vma ? vma->obj->frontbuffer : NULL, + INTEL_FRONTBUFFER_OVERLAY(pipe)); intel_frontbuffer_flip_prepare(overlay->i915, INTEL_FRONTBUFFER_OVERLAY(pipe)); @@ -768,7 +768,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay, ret = PTR_ERR(vma); goto out_pin_section; } - intel_fb_obj_flush(new_bo, ORIGIN_DIRTYFB); + intel_frontbuffer_flush(new_bo->frontbuffer, ORIGIN_DIRTYFB); ret = i915_vma_put_fence(vma); if (ret) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c index 5295285d5843..a65d401f891c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c @@ -48,7 +48,7 @@ static void __i915_do_clflush(struct drm_i915_gem_object *obj) { GEM_BUG_ON(!i915_gem_object_has_pages(obj)); drm_clflush_sg(obj->mm.pages); - intel_fb_obj_flush(obj, ORIGIN_CPU); + intel_frontbuffer_flush(obj->frontbuffer, ORIGIN_CPU); } static void i915_clflush_work(struct work_struct *work) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c index 2e3ce2a69653..a1afc2690e9e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -551,13 +551,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) return 0; } -static inline enum fb_op_origin -fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain) -{ - return (domain == I915_GEM_DOMAIN_GTT ? - obj->frontbuffer_ggtt_origin : ORIGIN_CPU); -} - /** * Called when user space prepares to use an object with the CPU, either * through the mmap ioctl's mapping or a GTT mapping. @@ -661,9 +654,8 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, i915_gem_object_unlock(obj); - if (write_domain != 0) - intel_fb_obj_invalidate(obj, - fb_write_origin(obj, write_domain)); + if (write_domain) + intel_frontbuffer_invalidate(obj->frontbuffer, ORIGIN_CPU); out_unpin: i915_gem_object_unpin_pages(obj); @@ -783,7 +775,7 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, } out: - intel_fb_obj_invalidate(obj, ORIGIN_CPU); + intel_frontbuffer_invalidate(obj->frontbuffer, ORIGIN_CPU); obj->mm.dirty = true; /* return with the pages pinned */ return 0; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c index a564c1e4231b..71d10ae90922 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -101,9 +101,6 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data, up_write(&mm->mmap_sem); if (IS_ERR_VALUE(addr)) goto err; - - /* This may race, but that's ok, it only gets set */ - WRITE_ONCE(obj->frontbuffer_ggtt_origin, ORIGIN_CPU); } i915_gem_object_put(obj); @@ -283,7 +280,6 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf) * Userspace is now writing through an untracked VMA, abandon * all hope that the hardware is able to track future writes. */ - obj->frontbuffer_ggtt_origin = ORIGIN_CPU; vma = i915_gem_object_ggtt_pin(obj, &view, 0, 0, flags); if (IS_ERR(vma) && !view.type) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index eccd7f4768f8..cfed28cc6185 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -45,16 +45,6 @@ void i915_gem_object_free(struct drm_i915_gem_object *obj) return kmem_cache_free(global.slab_objects, obj); } -static void -frontbuffer_retire(struct i915_active_request *active, - struct i915_request *request) -{ - struct drm_i915_gem_object *obj = - container_of(active, typeof(*obj), frontbuffer_write); - - intel_fb_obj_flush(obj, ORIGIN_CS); -} - void i915_gem_object_init(struct drm_i915_gem_object *obj, const struct drm_i915_gem_object_ops *ops) { @@ -71,10 +61,6 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj, obj->ops = ops; - obj->frontbuffer_ggtt_origin = ORIGIN_GTT; - i915_active_request_init(&obj->frontbuffer_write, - NULL, frontbuffer_retire); - obj->mm.madv = I915_MADV_WILLNEED; INIT_RADIX_TREE(&obj->mm.get_page.radix, GFP_KERNEL | __GFP_NOWARN); mutex_init(&obj->mm.get_page.lock); @@ -186,7 +172,6 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915, GEM_BUG_ON(atomic_read(&obj->bind_count)); GEM_BUG_ON(obj->userfault_count); - GEM_BUG_ON(atomic_read(&obj->frontbuffer_bits)); GEM_BUG_ON(!list_empty(&obj->lut_list)); atomic_set(&obj->mm.pages_pin_count, 0); @@ -259,6 +244,8 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); struct drm_i915_private *i915 = to_i915(obj->base.dev); + GEM_BUG_ON(i915_gem_object_is_framebuffer(obj)); + /* * Before we free the object, make sure any pure RCU-only * read-side critical sections are complete, e.g. @@ -290,13 +277,6 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj) queue_work(i915->wq, &i915->mm.free_work); } -static inline enum fb_op_origin -fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain) -{ - return (domain == I915_GEM_DOMAIN_GTT ? - obj->frontbuffer_ggtt_origin : ORIGIN_CPU); -} - static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj) { return !(obj->cache_level == I915_CACHE_NONE || @@ -319,8 +299,7 @@ i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj, for_each_ggtt_vma(vma, obj) intel_gt_flush_ggtt_writes(vma->vm->gt); - intel_fb_obj_flush(obj, - fb_write_origin(obj, I915_GEM_DOMAIN_GTT)); + intel_frontbuffer_flush(obj->frontbuffer, ORIGIN_CPU); for_each_ggtt_vma(vma, obj) { if (vma->iomap) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 3714cf234d64..abc23e7e13a7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -161,7 +161,7 @@ i915_gem_object_needs_async_cancel(const struct drm_i915_gem_object *obj) static inline bool i915_gem_object_is_framebuffer(const struct drm_i915_gem_object *obj) { - return READ_ONCE(obj->framebuffer_references); + return READ_ONCE(obj->frontbuffer); } static inline unsigned int diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index d474c6ac4100..ede0eb4218a8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -13,6 +13,7 @@ #include "i915_selftest.h" struct drm_i915_gem_object; +struct intel_fronbuffer; /* * struct i915_lut_handle tracks the fast lookups from handle to vma used @@ -141,9 +142,7 @@ struct drm_i915_gem_object { */ u16 write_domain; - atomic_t frontbuffer_bits; - unsigned int frontbuffer_ggtt_origin; /* write once */ - struct i915_active_request frontbuffer_write; + struct intel_frontbuffer *frontbuffer; /** Current tiling stride for the object, if it's tiled. */ unsigned int tiling_and_stride; @@ -224,9 +223,6 @@ struct drm_i915_gem_object { bool quirked:1; } mm; - /** References from framebuffers, locks out tiling changes. */ - unsigned int framebuffer_references; - /** Record of address bit 17 of each page at last unbind. */ unsigned long *bit_17; diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index afa22ce209a2..6f428819ee72 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -136,7 +136,6 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) struct drm_i915_private *dev_priv = to_i915(obj->base.dev); struct intel_engine_cs *engine; struct i915_vma *vma; - unsigned int frontbuffer_bits; int pin_count = 0; seq_printf(m, "%pK: %c%c%c%c %8zdKiB %02x %02x %s%s%s", @@ -226,10 +225,6 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) engine = i915_gem_object_last_write_engine(obj); if (engine) seq_printf(m, " (%s)", engine->name); - - frontbuffer_bits = atomic_read(&obj->frontbuffer_bits); - if (frontbuffer_bits) - seq_printf(m, " (frontbuffer: 0x%03x)", frontbuffer_bits); } struct file_stats { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 18e110f9df92..193db3da8f14 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2468,10 +2468,6 @@ int i915_gem_mmap_gtt(struct drm_file *file_priv, struct drm_device *dev, u32 handle, u64 *offset); int i915_gem_mmap_gtt_version(void); -void i915_gem_track_fb(struct drm_i915_gem_object *old, - struct drm_i915_gem_object *new, - unsigned frontbuffer_bits); - int __must_check i915_gem_set_global_seqno(struct drm_device *dev, u32 seqno); static inline u32 i915_reset_count(struct i915_gpu_error *error) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index c7650f15d36b..35b684db6960 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -138,17 +138,19 @@ i915_gem_phys_pwrite(struct drm_i915_gem_object *obj, void *vaddr = obj->phys_handle->vaddr + args->offset; char __user *user_data = u64_to_user_ptr(args->data_ptr); - /* We manually control the domain here and pretend that it + /* + * We manually control the domain here and pretend that it * remains coherent i.e. in the GTT domain, like shmem_pwrite. */ - intel_fb_obj_invalidate(obj, ORIGIN_CPU); + intel_frontbuffer_invalidate(obj->frontbuffer, ORIGIN_CPU); + if (copy_from_user(vaddr, user_data, args->size)) return -EFAULT; drm_clflush_virt_range(vaddr, args->size); intel_gt_chipset_flush(&to_i915(obj->base.dev)->gt); - intel_fb_obj_flush(obj, ORIGIN_CPU); + intel_frontbuffer_flush(obj->frontbuffer, ORIGIN_CPU); return 0; } @@ -593,7 +595,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj, goto out_unpin; } - intel_fb_obj_invalidate(obj, ORIGIN_CPU); + intel_frontbuffer_invalidate(obj->frontbuffer, ORIGIN_CPU); user_data = u64_to_user_ptr(args->data_ptr); offset = args->offset; @@ -635,7 +637,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj, user_data += page_length; offset += page_length; } - intel_fb_obj_flush(obj, ORIGIN_CPU); + intel_frontbuffer_flush(obj->frontbuffer, ORIGIN_CPU); i915_gem_object_unlock_fence(obj, fence); out_unpin: @@ -728,7 +730,7 @@ i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj, offset = 0; } - intel_fb_obj_flush(obj, ORIGIN_CPU); + intel_frontbuffer_flush(obj->frontbuffer, ORIGIN_CPU); i915_gem_object_unlock_fence(obj, fence); return ret; @@ -1783,39 +1785,6 @@ int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file) return ret; } -/** - * i915_gem_track_fb - update frontbuffer tracking - * @old: current GEM buffer for the frontbuffer slots - * @new: new GEM buffer for the frontbuffer slots - * @frontbuffer_bits: bitmask of frontbuffer slots - * - * This updates the frontbuffer tracking bits @frontbuffer_bits by clearing them - * from @old and setting them in @new. Both @old and @new can be NULL. - */ -void i915_gem_track_fb(struct drm_i915_gem_object *old, - struct drm_i915_gem_object *new, - unsigned frontbuffer_bits) -{ - /* Control of individual bits within the mask are guarded by - * the owning plane->mutex, i.e. we can never see concurrent - * manipulation of individual bits. But since the bitfield as a whole - * is updated using RMW, we need to use atomics in order to update - * the bits. - */ - BUILD_BUG_ON(INTEL_FRONTBUFFER_BITS_PER_PIPE * I915_MAX_PIPES > - BITS_PER_TYPE(atomic_t)); - - if (old) { - WARN_ON(!(atomic_read(&old->frontbuffer_bits) & frontbuffer_bits)); - atomic_andnot(frontbuffer_bits, &old->frontbuffer_bits); - } - - if (new) { - WARN_ON(atomic_read(&new->frontbuffer_bits) & frontbuffer_bits); - atomic_or(frontbuffer_bits, &new->frontbuffer_bits); - } -} - #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) #include "selftests/mock_gem_device.c" #include "selftests/i915_gem.c" diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 85a8e6fd34d5..c387770c3764 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -927,8 +927,10 @@ int i915_vma_move_to_active(struct i915_vma *vma, if (flags & EXEC_OBJECT_WRITE) { obj->write_domain = I915_GEM_DOMAIN_RENDER; - if (intel_fb_obj_invalidate(obj, ORIGIN_CS)) - __i915_active_request_set(&obj->frontbuffer_write, rq); + if (intel_frontbuffer_invalidate(obj->frontbuffer, ORIGIN_CS)) + i915_active_ref(&obj->frontbuffer->write, + rq->fence.context, + rq); obj->read_domains = 0; } diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index c4016164c34e..9ad9d1bde5f4 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -69,6 +69,7 @@ enum intel_output_type { struct intel_framebuffer { struct drm_framebuffer base; + struct intel_frontbuffer *frontbuffer; struct intel_rotation_info rot_info; /* for each plane in the normal GTT view */ From patchwork Fri Jul 26 08:46:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11060547 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AEC831398 for ; Fri, 26 Jul 2019 08:46:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9DB63205FC for ; Fri, 26 Jul 2019 08:46:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9225428A3B; Fri, 26 Jul 2019 08:46:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CB4C1205FC for ; Fri, 26 Jul 2019 08:46:38 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 313356E8B2; Fri, 26 Jul 2019 08:46:33 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id A52EA6E8A7 for ; Fri, 26 Jul 2019 08:46:26 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17621102-1500050 for multiple; Fri, 26 Jul 2019 09:46:18 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Jul 2019 09:46:13 +0100 Message-Id: <20190726084613.22129-27-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190726084613.22129-1-chris@chris-wilson.co.uk> References: <20190726084613.22129-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 27/27] drm/i915: Markup expected timeline locks for i915_active X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP As every i915_active_request should be serialised by a dedicated lock, i915_active consists of a tree of locks; one for each node. Markup up the i915_active_request with what lock is supposed to be guarding it so that we can verify that the serialised updated are indeed serialised. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/display/intel_overlay.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_client_blt.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_context.c | 2 +- drivers/gpu/drm/i915/gt/intel_context.c | 2 +- drivers/gpu/drm/i915/gt/intel_engine_pool.h | 2 +- drivers/gpu/drm/i915/gt/intel_timeline.c | 7 +++---- drivers/gpu/drm/i915/gt/selftest_timeline.c | 4 ++++ .../gpu/drm/i915/gt/selftests/mock_timeline.c | 2 +- drivers/gpu/drm/i915/i915_active.c | 16 +++++++++++----- drivers/gpu/drm/i915/i915_active.h | 12 ++++++++++-- drivers/gpu/drm/i915/i915_active_types.h | 3 +++ drivers/gpu/drm/i915/i915_vma.c | 4 ++-- drivers/gpu/drm/i915/selftests/i915_active.c | 3 +-- 13 files changed, 40 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_overlay.c b/drivers/gpu/drm/i915/display/intel_overlay.c index 1a15fa34205c..d5e06b463567 100644 --- a/drivers/gpu/drm/i915/display/intel_overlay.c +++ b/drivers/gpu/drm/i915/display/intel_overlay.c @@ -230,7 +230,7 @@ alloc_request(struct intel_overlay *overlay, void (*fn)(struct intel_overlay *)) if (IS_ERR(rq)) return rq; - err = i915_active_ref(&overlay->last_flip, rq->fence.context, rq); + err = i915_active_ref(&overlay->last_flip, rq->timeline, rq); if (err) { i915_request_add(rq); return ERR_PTR(err); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c index 2312a0c6af89..f15b0b82e562 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c @@ -197,7 +197,7 @@ static void clear_pages_worker(struct work_struct *work) * keep track of the GPU activity within this vma/request, and * propagate the signal from the request to w->dma. */ - err = i915_active_ref(&vma->active, rq->fence.context, rq); + err = i915_active_ref(&vma->active, rq->timeline, rq); if (err) goto out_request; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index e31431fa141e..e595fa239674 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -927,7 +927,7 @@ static int context_barrier_task(struct i915_gem_context *ctx, if (emit) err = emit(rq, data); if (err == 0) - err = i915_active_ref(&cb->base, rq->fence.context, rq); + err = i915_active_ref(&cb->base, rq->timeline, rq); i915_request_add(rq); if (err) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 2bae9bcffc1c..8d647ff1d300 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -300,7 +300,7 @@ int intel_context_prepare_remote_request(struct intel_context *ce, * words transfer the pinned ce object to tracked active request. */ GEM_BUG_ON(i915_active_is_idle(&ce->active)); - err = i915_active_ref(&ce->active, rq->fence.context, rq); + err = i915_active_ref(&ce->active, rq->timeline, rq); unlock: if (rq->timeline != tl) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.h b/drivers/gpu/drm/i915/gt/intel_engine_pool.h index f7a0a660c1c9..8d069efd9457 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pool.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.h @@ -18,7 +18,7 @@ static inline int intel_engine_pool_mark_active(struct intel_engine_pool_node *node, struct i915_request *rq) { - return i915_active_ref(&node->active, rq->fence.context, rq); + return i915_active_ref(&node->active, rq->timeline, rq); } static inline void diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index eafd94d5e211..02fbe11b671b 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -254,7 +254,7 @@ int intel_timeline_init(struct intel_timeline *timeline, mutex_init(&timeline->mutex); - INIT_ACTIVE_REQUEST(&timeline->last_request); + INIT_ACTIVE_REQUEST(&timeline->last_request, &timeline->mutex); INIT_LIST_HEAD(&timeline->requests); i915_syncmap_init(&timeline->sync); @@ -440,8 +440,7 @@ __intel_timeline_get_seqno(struct intel_timeline *tl, * free it after the current request is retired, which ensures that * all writes into the cacheline from previous requests are complete. */ - err = i915_active_ref(&tl->hwsp_cacheline->active, - tl->fence_context, rq); + err = i915_active_ref(&tl->hwsp_cacheline->active, tl, rq); if (err) goto err_cacheline; @@ -492,7 +491,7 @@ int intel_timeline_get_seqno(struct intel_timeline *tl, static int cacheline_ref(struct intel_timeline_cacheline *cl, struct i915_request *rq) { - return i915_active_ref(&cl->active, rq->fence.context, rq); + return i915_active_ref(&cl->active, rq->timeline, rq); } int intel_timeline_read_hwsp(struct i915_request *from, diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c index d54113697745..321481403165 100644 --- a/drivers/gpu/drm/i915/gt/selftest_timeline.c +++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c @@ -689,7 +689,9 @@ static int live_hwsp_wrap(void *arg) tl->seqno = -4u; + mutex_lock_nested(&tl->mutex, SINGLE_DEPTH_NESTING); err = intel_timeline_get_seqno(tl, rq, &seqno[0]); + mutex_unlock(&tl->mutex); if (err) { i915_request_add(rq); goto out; @@ -704,7 +706,9 @@ static int live_hwsp_wrap(void *arg) } hwsp_seqno[0] = tl->hwsp_seqno; + mutex_lock_nested(&tl->mutex, SINGLE_DEPTH_NESTING); err = intel_timeline_get_seqno(tl, rq, &seqno[1]); + mutex_unlock(&tl->mutex); if (err) { i915_request_add(rq); goto out; diff --git a/drivers/gpu/drm/i915/gt/selftests/mock_timeline.c b/drivers/gpu/drm/i915/gt/selftests/mock_timeline.c index 5c549205828a..598170efcaf6 100644 --- a/drivers/gpu/drm/i915/gt/selftests/mock_timeline.c +++ b/drivers/gpu/drm/i915/gt/selftests/mock_timeline.c @@ -15,7 +15,7 @@ void mock_timeline_init(struct intel_timeline *timeline, u64 context) mutex_init(&timeline->mutex); - INIT_ACTIVE_REQUEST(&timeline->last_request); + INIT_ACTIVE_REQUEST(&timeline->last_request, &timeline->mutex); INIT_LIST_HEAD(&timeline->requests); i915_syncmap_init(&timeline->sync); diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index bbb44b072d1a..034975714aa3 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -131,10 +131,11 @@ node_retire(struct i915_active_request *base, struct i915_request *rq) } static struct i915_active_request * -active_instance(struct i915_active *ref, u64 idx) +active_instance(struct i915_active *ref, struct intel_timeline *tl) { struct active_node *node, *prealloc; struct rb_node **p, *parent; + u64 idx = tl->fence_context; /* * We track the most recently used timeline to skip a rbtree search @@ -173,7 +174,7 @@ active_instance(struct i915_active *ref, u64 idx) } node = prealloc; - i915_active_request_init(&node->base, NULL, node_retire); + i915_active_request_init(&node->base, &tl->mutex, NULL, node_retire); node->ref = ref; node->timeline = idx; @@ -208,20 +209,21 @@ void __i915_active_init(struct drm_i915_private *i915, } int i915_active_ref(struct i915_active *ref, - u64 timeline, + struct intel_timeline *tl, struct i915_request *rq) { struct i915_active_request *active; int err; - GEM_BUG_ON(!timeline); /* reserved for idle-barrier */ + GEM_BUG_ON(!tl->fence_context); /* reserved for idle-barrier */ + lockdep_assert_held(&tl->mutex); /* Prevent reaping in case we malloc/wait while building the tree */ err = i915_active_acquire(ref); if (err) return err; - active = active_instance(ref, timeline); + active = active_instance(ref, tl); if (!active) { err = -ENOMEM; goto out; @@ -510,6 +512,10 @@ int i915_active_request_set(struct i915_active_request *active, { int err; +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) + lockdep_assert_held(active->lock); +#endif + /* Must maintain ordering wrt previous active requests */ err = i915_request_await_active_request(rq, active); if (err) diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index ad28b866eb87..73425224d934 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -58,15 +58,20 @@ void i915_active_retire_noop(struct i915_active_request *active, */ static inline void i915_active_request_init(struct i915_active_request *active, + struct mutex *lock, struct i915_request *rq, i915_active_retire_fn retire) { RCU_INIT_POINTER(active->request, rq); INIT_LIST_HEAD(&active->link); active->retire = retire ?: i915_active_retire_noop; +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) + active->lock = lock; +#endif } -#define INIT_ACTIVE_REQUEST(name) i915_active_request_init((name), NULL, NULL) +#define INIT_ACTIVE_REQUEST(name, lock) \ + i915_active_request_init((name), (lock), NULL, NULL) /** * i915_active_request_set - updates the tracker to watch the current request @@ -81,6 +86,9 @@ static inline void __i915_active_request_set(struct i915_active_request *active, struct i915_request *request) { +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) + lockdep_assert_held(active->lock); +#endif list_move(&active->link, &request->active_list); rcu_assign_pointer(active->request, request); } @@ -362,7 +370,7 @@ void __i915_active_init(struct drm_i915_private *i915, } while (0) int i915_active_ref(struct i915_active *ref, - u64 timeline, + struct intel_timeline *tl, struct i915_request *rq); int i915_active_wait(struct i915_active *ref); diff --git a/drivers/gpu/drm/i915/i915_active_types.h b/drivers/gpu/drm/i915/i915_active_types.h index 74743dd0d5f0..a21ec7e00f5a 100644 --- a/drivers/gpu/drm/i915/i915_active_types.h +++ b/drivers/gpu/drm/i915/i915_active_types.h @@ -24,6 +24,9 @@ struct i915_active_request { struct i915_request __rcu *request; struct list_head link; i915_active_retire_fn retire; +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) + struct mutex *lock; +#endif }; struct active_node; diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index c387770c3764..f7f516a75c66 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -919,7 +919,7 @@ int i915_vma_move_to_active(struct i915_vma *vma, * add the active reference first and queue for it to be dropped * *last*. */ - err = i915_active_ref(&vma->active, rq->fence.context, rq); + err = i915_active_ref(&vma->active, rq->timeline, rq); if (unlikely(err)) return err; @@ -929,7 +929,7 @@ int i915_vma_move_to_active(struct i915_vma *vma, if (intel_frontbuffer_invalidate(obj->frontbuffer, ORIGIN_CS)) i915_active_ref(&obj->frontbuffer->write, - rq->fence.context, + rq->timeline, rq); obj->read_domains = 0; diff --git a/drivers/gpu/drm/i915/selftests/i915_active.c b/drivers/gpu/drm/i915/selftests/i915_active.c index e5cd5d47e380..77d844ac8b71 100644 --- a/drivers/gpu/drm/i915/selftests/i915_active.c +++ b/drivers/gpu/drm/i915/selftests/i915_active.c @@ -110,8 +110,7 @@ __live_active_setup(struct drm_i915_private *i915) submit, GFP_KERNEL); if (err >= 0) - err = i915_active_ref(&active->base, - rq->fence.context, rq); + err = i915_active_ref(&active->base, rq->timeline, rq); i915_request_add(rq); if (err) { pr_err("Failed to track active ref!\n");