From patchwork Mon Jun 25 09:48:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10485411 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id D7958604D3 for ; Mon, 25 Jun 2018 09:49:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D3FAD2881B for ; Mon, 25 Jun 2018 09:49:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C889728946; Mon, 25 Jun 2018 09:49:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 2CC3B28946 for ; Mon, 25 Jun 2018 09:49:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 06A5C6E18D; Mon, 25 Jun 2018 09:49:41 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id B17726E187 for ; Mon, 25 Jun 2018 09:49:30 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 12149669-1500050 for multiple; Mon, 25 Jun 2018 10:49:18 +0100 Received: by haswell.alporthouse.com (sSMTP sendmail emulation); Mon, 25 Jun 2018 10:49:19 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 25 Jun 2018 10:48:28 +0100 Message-Id: <20180625094842.8499-17-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180625094842.8499-1-chris@chris-wilson.co.uk> References: <20180625094842.8499-1-chris@chris-wilson.co.uk> X-Originating-IP: 78.156.65.138 X-Country: code=GB country="United Kingdom" ip=78.156.65.138 Subject: [Intel-gfx] [PATCH 17/31] drm/i915: Combine multiple internal plists into the same i915_priolist bucket X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP As we are about to allow ourselves to slightly bump the user priority into a few different sublevels, packthose internal priority lists into the same i915_priolist to keep the rbtree compact and avoid having to allocate the default user priority even after the internal bumping. The downside to having an requests[] rather than a node per active list, is that we then have to walk over the empty higher priority lists. To compensate, we track the active buckets and use a small bitmap to skip over any inactive ones. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_engine_cs.c | 6 +- drivers/gpu/drm/i915/intel_guc_submission.c | 12 ++- drivers/gpu/drm/i915/intel_lrc.c | 87 ++++++++++++++------- drivers/gpu/drm/i915/intel_ringbuffer.h | 13 ++- 4 files changed, 80 insertions(+), 38 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 78689367c7a6..c5a4a8f7cc49 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -1615,10 +1615,10 @@ void intel_engine_dump(struct intel_engine_cs *engine, count = 0; drm_printf(m, "\t\tQueue priority: %d\n", execlists->queue_priority); for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) { - struct i915_priolist *p = - rb_entry(rb, typeof(*p), node); + struct i915_priolist *p = rb_entry(rb, typeof(*p), node); + int i; - list_for_each_entry(rq, &p->requests, sched.link) { + priolist_for_each_request(rq, p, i) { if (count++ < MAX_REQUESTS_TO_SHOW - 1) print_request(m, rq, "\t\tQ "); else diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c index 9a2c6856a71e..660c41ec71ab 100644 --- a/drivers/gpu/drm/i915/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/intel_guc_submission.c @@ -718,30 +718,28 @@ static bool __guc_dequeue(struct intel_engine_cs *engine) while ((rb = rb_first_cached(&execlists->queue))) { struct i915_priolist *p = to_priolist(rb); struct i915_request *rq, *rn; + int i; - list_for_each_entry_safe(rq, rn, &p->requests, sched.link) { + priolist_for_each_request_consume(rq, rn, p, i) { if (last && rq->hw_context != last->hw_context) { - if (port == last_port) { - __list_del_many(&p->requests, - &rq->sched.link); + if (port == last_port) goto done; - } if (submit) port_assign(port, last); port++; } - INIT_LIST_HEAD(&rq->sched.link); + list_del_init(&rq->sched.link); __i915_request_submit(rq); trace_i915_request_in(rq, port_index(port, execlists)); + last = rq; submit = true; } rb_erase_cached(&p->node, &execlists->queue); - INIT_LIST_HEAD(&p->requests); if (p->priority != I915_PRIORITY_NORMAL) kmem_cache_free(engine->i915->priorities, p); } diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index e55067ec41f7..98c922808e3e 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -259,14 +259,49 @@ intel_lr_context_descriptor_update(struct i915_gem_context *ctx, ce->lrc_desc = desc; } -static struct i915_priolist * +static void assert_priolists(struct intel_engine_execlists * const execlists, + int queue_priority) +{ + struct rb_node *rb; + int last_prio, i; + + if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) + return; + + GEM_BUG_ON(rb_first_cached(&execlists->queue) != + rb_first(&execlists->queue.rb_root)); + + last_prio = (queue_priority >> I915_USER_PRIORITY_SHIFT) + 1; + for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) { + struct i915_priolist *p = to_priolist(rb); + + GEM_BUG_ON(p->priority >= last_prio); + last_prio = p->priority; + + GEM_BUG_ON(!p->used); + for (i = 0; i < ARRAY_SIZE(p->requests); i++) { + if (list_empty(&p->requests[i])) + continue; + + GEM_BUG_ON(!(p->used & BIT(i))); + } + } +} + +static struct list_head * lookup_priolist(struct intel_engine_cs *engine, int prio) { struct intel_engine_execlists * const execlists = &engine->execlists; struct i915_priolist *p; struct rb_node **parent, *rb; bool first = true; + int idx, i; + + assert_priolists(execlists, INT_MAX); + /* buckets sorted from highest [in slot 0] to lowest priority */ + idx = I915_PRIORITY_COUNT - (prio & ~I915_PRIORITY_MASK) - 1; + prio >>= I915_USER_PRIORITY_SHIFT; if (unlikely(execlists->no_priolist)) prio = I915_PRIORITY_NORMAL; @@ -283,7 +318,7 @@ lookup_priolist(struct intel_engine_cs *engine, int prio) parent = &rb->rb_right; first = false; } else { - return p; + goto out; } } @@ -309,11 +344,15 @@ lookup_priolist(struct intel_engine_cs *engine, int prio) } p->priority = prio; - INIT_LIST_HEAD(&p->requests); + for (i = 0; i < ARRAY_SIZE(p->requests); i++) + INIT_LIST_HEAD(&p->requests[i]); rb_link_node(&p->node, rb, parent); rb_insert_color_cached(&p->node, &execlists->queue, first); + p->used = 0; - return p; +out: + p->used |= BIT(idx); + return &p->requests[idx]; } static void unwind_wa_tail(struct i915_request *rq) @@ -325,7 +364,7 @@ static void unwind_wa_tail(struct i915_request *rq) static void __unwind_incomplete_requests(struct intel_engine_cs *engine) { struct i915_request *rq, *rn; - struct i915_priolist *uninitialized_var(p); + struct list_head *uninitialized_var(pl); int last_prio = I915_PRIORITY_INVALID; lockdep_assert_held(&engine->timeline.lock); @@ -342,11 +381,10 @@ static void __unwind_incomplete_requests(struct intel_engine_cs *engine) GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID); if (rq_prio(rq) != last_prio) { last_prio = rq_prio(rq); - p = lookup_priolist(engine, last_prio); + pl = lookup_priolist(engine, last_prio); } - GEM_BUG_ON(p->priority != rq_prio(rq)); - list_add(&rq->sched.link, &p->requests); + list_add(&rq->sched.link, pl); } } @@ -664,8 +702,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine) while ((rb = rb_first_cached(&execlists->queue))) { struct i915_priolist *p = to_priolist(rb); struct i915_request *rq, *rn; + int i; - list_for_each_entry_safe(rq, rn, &p->requests, sched.link) { + priolist_for_each_request_consume(rq, rn, p, i) { /* * Can we combine this request with the current port? * It has to be the same context/ringbuffer and not @@ -684,11 +723,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * combine this request with the last, then we * are done. */ - if (port == last_port) { - __list_del_many(&p->requests, - &rq->sched.link); + if (port == last_port) goto done; - } /* * If GVT overrides us we only ever submit @@ -698,11 +734,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * request) to the second port. */ if (ctx_single_port_submission(last->hw_context) || - ctx_single_port_submission(rq->hw_context)) { - __list_del_many(&p->requests, - &rq->sched.link); + ctx_single_port_submission(rq->hw_context)) goto done; - } GEM_BUG_ON(last->hw_context == rq->hw_context); @@ -713,15 +746,16 @@ static void execlists_dequeue(struct intel_engine_cs *engine) GEM_BUG_ON(port_isset(port)); } - INIT_LIST_HEAD(&rq->sched.link); + list_del_init(&rq->sched.link); + __i915_request_submit(rq); trace_i915_request_in(rq, port_index(port, execlists)); + last = rq; submit = true; } rb_erase_cached(&p->node, &execlists->queue); - INIT_LIST_HEAD(&p->requests); if (p->priority != I915_PRIORITY_NORMAL) kmem_cache_free(engine->i915->priorities, p); } @@ -745,6 +779,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) */ execlists->queue_priority = port != execlists->port ? rq_prio(last) : INT_MIN; + assert_priolists(execlists, execlists->queue_priority); if (submit) { port_assign(port, last); @@ -904,16 +939,16 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine) /* Flush the queued requests to the timeline list (for retiring). */ while ((rb = rb_first_cached(&execlists->queue))) { struct i915_priolist *p = to_priolist(rb); + int i; - list_for_each_entry_safe(rq, rn, &p->requests, sched.link) { - INIT_LIST_HEAD(&rq->sched.link); + priolist_for_each_request_consume(rq, rn, p, i) { + list_del_init(&rq->sched.link); dma_fence_set_error(&rq->fence, -EIO); __i915_request_submit(rq); } rb_erase_cached(&p->node, &execlists->queue); - INIT_LIST_HEAD(&p->requests); if (p->priority != I915_PRIORITY_NORMAL) kmem_cache_free(engine->i915->priorities, p); } @@ -1114,8 +1149,7 @@ static void queue_request(struct intel_engine_cs *engine, struct i915_sched_node *node, int prio) { - list_add_tail(&node->link, - &lookup_priolist(engine, prio)->requests); + list_add_tail(&node->link, lookup_priolist(engine, prio)); } static void __update_queue(struct intel_engine_cs *engine, int prio) @@ -1185,7 +1219,7 @@ sched_lock_engine(struct i915_sched_node *node, struct intel_engine_cs *locked) static void execlists_schedule(struct i915_request *request, const struct i915_sched_attr *attr) { - struct i915_priolist *uninitialized_var(pl); + struct list_head *uninitialized_var(pl); struct intel_engine_cs *engine, *last; struct i915_dependency *dep, *p; struct i915_dependency stack; @@ -1280,8 +1314,7 @@ static void execlists_schedule(struct i915_request *request, pl = lookup_priolist(engine, prio); last = engine; } - GEM_BUG_ON(pl->priority != prio); - list_move_tail(&node->link, &pl->requests); + list_move_tail(&node->link, pl); } if (prio > engine->execlists.queue_priority && diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 07b64614cfdb..cc5ff6b4b567 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -188,11 +188,22 @@ enum intel_engine_id { }; struct i915_priolist { + struct list_head requests[I915_PRIORITY_COUNT]; struct rb_node node; - struct list_head requests; + unsigned long used; int priority; }; +#define priolist_for_each_request(it, plist, idx) \ + for (idx = 0; idx < ARRAY_SIZE((plist)->requests); idx++) \ + list_for_each_entry(it, &(plist)->requests[idx], sched.link) + +#define priolist_for_each_request_consume(it, n, plist, idx) \ + for (; (idx = ffs((plist)->used)); (plist)->used &= ~BIT(idx - 1)) \ + list_for_each_entry_safe(it, n, \ + &(plist)->requests[idx - 1], \ + sched.link) + /** * struct intel_engine_execlists - execlist submission queue and port state *