From patchwork Mon May 25 20:28:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11569435 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B6D3D60D for ; Mon, 25 May 2020 20:29:39 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9FB8120776 for ; Mon, 25 May 2020 20:29:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9FB8120776 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 31E1289DFE; Mon, 25 May 2020 20:29:39 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7C4E089DFE for ; Mon, 25 May 2020 20:29:37 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21290259-1500050 for multiple; Mon, 25 May 2020 21:28:58 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 25 May 2020 21:28:58 +0100 Message-Id: <20200525202901.32244-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 1/4] drm/i915/gt: Do not schedule normal requests immediately along virtual X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: stable@vger.kernel.org, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" When we push a virtual request onto the HW, we update the rq->engine to point to the physical engine. A request that is then submitted by the user that waits upon the virtual engine, but along the physical engine in use, will then see that it is due to be submitted to the same engine and take a shortcut (and be queued without waiting for the completion fence). However, the virtual request may be preempted (either by higher priority users, or by timeslicing) and removed from the physical engine to be migrated over to one of its siblings. The dependent normal request however is oblivious to the removal of the virtual request and remains queued to execute on HW, believing that once it reaches the head of its queue all of its predecessors will have completed executing! Fixes: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine") Testcase: igt/gem_exec_balancer/sliced Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: # v5.3+ Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_request.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index c282719ad3ac..51588209bddd 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -1074,7 +1074,7 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from) return ret; } - if (to->engine == from->engine) + if (is_power_of_2(to->execution_mask | READ_ONCE(from->execution_mask))) ret = i915_sw_fence_await_sw_fence_gfp(&to->submit, &from->submit, I915_FENCE_GFP); From patchwork Mon May 25 20:28:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11569433 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9E87C60D for ; Mon, 25 May 2020 20:29:14 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 88047206D5 for ; Mon, 25 May 2020 20:29:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 88047206D5 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D355B89DCF; Mon, 25 May 2020 20:29:11 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 67B1889DCF for ; Mon, 25 May 2020 20:29:10 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21290260-1500050 for multiple; Mon, 25 May 2020 21:28:58 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 25 May 2020 21:28:59 +0100 Message-Id: <20200525202901.32244-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200525202901.32244-1-chris@chris-wilson.co.uk> References: <20200525202901.32244-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 2/4] drm/i915/gt: Use virtual_engine during execlists_dequeue X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Rather than going back and forth between the rb_node entry and the virtual_engine type, store the ve local and reuse it. As the container_of conversion from rb_node to virtual_engine requires a variable offset, performing that conversion just once shaves off a bit of code. v2: Keep a single virtual engine lookup, for typical use. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 217 +++++++++++++--------------- 1 file changed, 104 insertions(+), 113 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index de5be57ed6d2..b3dc88ac759d 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -451,7 +451,7 @@ static int queue_prio(const struct intel_engine_execlists *execlists) static inline bool need_preempt(const struct intel_engine_cs *engine, const struct i915_request *rq, - struct rb_node *rb) + struct virtual_engine *ve) { int last_prio; @@ -488,9 +488,7 @@ static inline bool need_preempt(const struct intel_engine_cs *engine, rq_prio(list_next_entry(rq, sched.link)) > last_prio) return true; - if (rb) { - struct virtual_engine *ve = - rb_entry(rb, typeof(*ve), nodes[engine->id].rb); + if (ve) { bool preempt = false; if (engine == ve->siblings[0]) { /* only preempt one sibling */ @@ -1812,6 +1810,35 @@ static bool virtual_matches(const struct virtual_engine *ve, return true; } +static struct virtual_engine * +first_virtual_engine(struct intel_engine_cs *engine) +{ + struct intel_engine_execlists *el = &engine->execlists; + struct rb_node *rb = rb_first_cached(&el->virtual); + + while (rb) { + struct virtual_engine *ve = + rb_entry(rb, typeof(*ve), nodes[engine->id].rb); + struct i915_request *rq = READ_ONCE(ve->request); + + if (!rq) { /* lazily cleanup after another engine handled rq */ + rb_erase_cached(rb, &el->virtual); + RB_CLEAR_NODE(rb); + rb = rb_first_cached(&el->virtual); + continue; + } + + if (!virtual_matches(ve, rq, engine)) { + rb = rb_next(rb); + continue; + } + + return ve; + } + + return NULL; +} + static void virtual_xfer_breadcrumbs(struct virtual_engine *ve) { /* @@ -1896,7 +1923,7 @@ static void defer_active(struct intel_engine_cs *engine) static bool need_timeslice(const struct intel_engine_cs *engine, const struct i915_request *rq, - const struct rb_node *rb) + struct virtual_engine *ve) { int hint; @@ -1905,9 +1932,7 @@ need_timeslice(const struct intel_engine_cs *engine, hint = engine->execlists.queue_priority_hint; - if (rb) { - const struct virtual_engine *ve = - rb_entry(rb, typeof(*ve), nodes[engine->id].rb); + if (ve) { const struct intel_engine_cs *inflight = intel_context_inflight(&ve->context); @@ -2057,7 +2082,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine) struct intel_engine_execlists * const execlists = &engine->execlists; struct i915_request **port = execlists->pending; struct i915_request ** const last_port = port + execlists->port_mask; - struct i915_request * const *active; + struct i915_request * const *active = READ_ONCE(execlists->active); + struct virtual_engine *ve = first_virtual_engine(engine); struct i915_request *last; struct rb_node *rb; bool submit = false; @@ -2084,26 +2110,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * and context switches) submission. */ - for (rb = rb_first_cached(&execlists->virtual); rb; ) { - struct virtual_engine *ve = - rb_entry(rb, typeof(*ve), nodes[engine->id].rb); - struct i915_request *rq = READ_ONCE(ve->request); - - if (!rq) { /* lazily cleanup after another engine handled rq */ - rb_erase_cached(rb, &execlists->virtual); - RB_CLEAR_NODE(rb); - rb = rb_first_cached(&execlists->virtual); - continue; - } - - if (!virtual_matches(ve, rq, engine)) { - rb = rb_next(rb); - continue; - } - - break; - } - /* * If the queue is higher priority than the last * request in the currently active context, submit afresh. @@ -2111,10 +2117,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * the active context to interject the preemption request, * i.e. we will retrigger preemption following the ack in case * of trouble. - */ - active = READ_ONCE(execlists->active); - - /* + * * In theory we can skip over completed contexts that have not * yet been processed by events (as those events are in flight): * @@ -2125,9 +2128,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * find itself trying to jump back into a context it has just * completed and barf. */ - if ((last = *active)) { - if (need_preempt(engine, last, rb)) { + if (need_preempt(engine, last, ve)) { if (i915_request_completed(last)) { tasklet_hi_schedule(&execlists->tasklet); return; @@ -2158,7 +2160,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) __unwind_incomplete_requests(engine); last = NULL; - } else if (need_timeslice(engine, last, rb) && + } else if (need_timeslice(engine, last, ve) && timeslice_expired(execlists, last)) { if (i915_request_completed(last)) { tasklet_hi_schedule(&execlists->tasklet); @@ -2212,110 +2214,99 @@ static void execlists_dequeue(struct intel_engine_cs *engine) } } - while (rb) { /* XXX virtual is always taking precedence */ - struct virtual_engine *ve = - rb_entry(rb, typeof(*ve), nodes[engine->id].rb); + while (ve) { /* XXX virtual is always taking precedence */ struct i915_request *rq; spin_lock(&ve->base.active.lock); rq = ve->request; - if (unlikely(!rq)) { /* lost the race to a sibling */ - spin_unlock(&ve->base.active.lock); - rb_erase_cached(rb, &execlists->virtual); - RB_CLEAR_NODE(rb); - rb = rb_first_cached(&execlists->virtual); - continue; - } + if (unlikely(!rq)) /* lost the race to a sibling */ + goto unlock; GEM_BUG_ON(rq != ve->request); GEM_BUG_ON(rq->engine != &ve->base); GEM_BUG_ON(rq->context != &ve->context); - if (rq_prio(rq) >= queue_prio(execlists)) { - if (!virtual_matches(ve, rq, engine)) { - spin_unlock(&ve->base.active.lock); - rb = rb_next(rb); - continue; - } + if (unlikely(rq_prio(rq) < queue_prio(execlists))) { + spin_unlock(&ve->base.active.lock); + break; + } - if (last && !can_merge_rq(last, rq)) { - spin_unlock(&ve->base.active.lock); - start_timeslice(engine, rq_prio(rq)); - return; /* leave this for another sibling */ - } + GEM_BUG_ON(!virtual_matches(ve, rq, engine)); - ENGINE_TRACE(engine, - "virtual rq=%llx:%lld%s, new engine? %s\n", - rq->fence.context, - rq->fence.seqno, - i915_request_completed(rq) ? "!" : - i915_request_started(rq) ? "*" : - "", - yesno(engine != ve->siblings[0])); - - WRITE_ONCE(ve->request, NULL); - WRITE_ONCE(ve->base.execlists.queue_priority_hint, - INT_MIN); - rb_erase_cached(rb, &execlists->virtual); - RB_CLEAR_NODE(rb); + if (last && !can_merge_rq(last, rq)) { + spin_unlock(&ve->base.active.lock); + start_timeslice(engine, rq_prio(rq)); + return; /* leave this for another sibling */ + } - GEM_BUG_ON(!(rq->execution_mask & engine->mask)); - WRITE_ONCE(rq->engine, engine); + ENGINE_TRACE(engine, + "virtual rq=%llx:%lld%s, new engine? %s\n", + rq->fence.context, + rq->fence.seqno, + i915_request_completed(rq) ? "!" : + i915_request_started(rq) ? "*" : + "", + yesno(engine != ve->siblings[0])); - if (engine != ve->siblings[0]) { - u32 *regs = ve->context.lrc_reg_state; - unsigned int n; + WRITE_ONCE(ve->request, NULL); + WRITE_ONCE(ve->base.execlists.queue_priority_hint, + INT_MIN); - GEM_BUG_ON(READ_ONCE(ve->context.inflight)); + rb = &ve->nodes[engine->id].rb; + rb_erase_cached(rb, &execlists->virtual); + RB_CLEAR_NODE(rb); - if (!intel_engine_has_relative_mmio(engine)) - virtual_update_register_offsets(regs, - engine); + GEM_BUG_ON(!(rq->execution_mask & engine->mask)); + WRITE_ONCE(rq->engine, engine); - if (!list_empty(&ve->context.signals)) - virtual_xfer_breadcrumbs(ve); + if (engine != ve->siblings[0]) { + u32 *regs = ve->context.lrc_reg_state; + unsigned int n; - /* - * Move the bound engine to the top of the list - * for future execution. We then kick this - * tasklet first before checking others, so that - * we preferentially reuse this set of bound - * registers. - */ - for (n = 1; n < ve->num_siblings; n++) { - if (ve->siblings[n] == engine) { - swap(ve->siblings[n], - ve->siblings[0]); - break; - } - } + GEM_BUG_ON(READ_ONCE(ve->context.inflight)); - GEM_BUG_ON(ve->siblings[0] != engine); - } + if (!intel_engine_has_relative_mmio(engine)) + virtual_update_register_offsets(regs, + engine); - if (__i915_request_submit(rq)) { - submit = true; - last = rq; - } - i915_request_put(rq); + if (!list_empty(&ve->context.signals)) + virtual_xfer_breadcrumbs(ve); /* - * Hmm, we have a bunch of virtual engine requests, - * but the first one was already completed (thanks - * preempt-to-busy!). Keep looking at the veng queue - * until we have no more relevant requests (i.e. - * the normal submit queue has higher priority). + * Move the bound engine to the top of the list for + * future execution. We then kick this tasklet first + * before checking others, so that we preferentially + * reuse this set of bound registers. */ - if (!submit) { - spin_unlock(&ve->base.active.lock); - rb = rb_first_cached(&execlists->virtual); - continue; + for (n = 1; n < ve->num_siblings; n++) { + if (ve->siblings[n] == engine) { + swap(ve->siblings[n], + ve->siblings[0]); + break; + } } + + GEM_BUG_ON(ve->siblings[0] != engine); } + if (__i915_request_submit(rq)) { + submit = true; + last = rq; + } + + i915_request_put(rq); +unlock: spin_unlock(&ve->base.active.lock); - break; + + /* + * Hmm, we have a bunch of virtual engine requests, + * but the first one was already completed (thanks + * preempt-to-busy!). Keep looking at the veng queue + * until we have no more relevant requests (i.e. + * the normal submit queue has higher priority). + */ + ve = submit ? NULL : first_virtual_engine(engine); } while ((rb = rb_first_cached(&execlists->queue))) { From patchwork Mon May 25 20:29:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11569431 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 98A4360D for ; Mon, 25 May 2020 20:29:12 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8160E206D5 for ; Mon, 25 May 2020 20:29:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8160E206D5 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9BB6789DC7; Mon, 25 May 2020 20:29:11 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6068889DC7 for ; Mon, 25 May 2020 20:29:10 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21290261-1500050 for multiple; Mon, 25 May 2020 21:28:58 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 25 May 2020 21:29:00 +0100 Message-Id: <20200525202901.32244-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200525202901.32244-1-chris@chris-wilson.co.uk> References: <20200525202901.32244-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 3/4] drm/i915/gt: Decouple inflight virtual engines X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Once a virtual engine has been bound to a sibling, it will remain bound until we finally schedule out the last active request. We can not rebind the context to a new sibling while it is inflight as the context save will conflict, hence we wait. As we cannot then use any other sibliing while the context is inflight, only kick the bound sibling while it inflight and upon scheduling out the kick the rest (so that we can swap engines on timeslicing if the previously bound engine becomes oversubscribed). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 30 +++++++++++++---------------- 1 file changed, 13 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index b3dc88ac759d..665d62c3a54d 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1398,9 +1398,8 @@ execlists_schedule_in(struct i915_request *rq, int idx) static void kick_siblings(struct i915_request *rq, struct intel_context *ce) { struct virtual_engine *ve = container_of(ce, typeof(*ve), context); - struct i915_request *next = READ_ONCE(ve->request); - if (next == rq || (next && next->execution_mask & ~rq->execution_mask)) + if (READ_ONCE(ve->request)) tasklet_hi_schedule(&ve->base.execlists.tasklet); } @@ -1821,18 +1820,14 @@ first_virtual_engine(struct intel_engine_cs *engine) rb_entry(rb, typeof(*ve), nodes[engine->id].rb); struct i915_request *rq = READ_ONCE(ve->request); - if (!rq) { /* lazily cleanup after another engine handled rq */ + /* lazily cleanup after another engine handled rq */ + if (!rq || !virtual_matches(ve, rq, engine)) { rb_erase_cached(rb, &el->virtual); RB_CLEAR_NODE(rb); rb = rb_first_cached(&el->virtual); continue; } - if (!virtual_matches(ve, rq, engine)) { - rb = rb_next(rb); - continue; - } - return ve; } @@ -5468,7 +5463,6 @@ static void virtual_submission_tasklet(unsigned long data) if (unlikely(!mask)) return; - local_irq_disable(); for (n = 0; n < ve->num_siblings; n++) { struct intel_engine_cs *sibling = READ_ONCE(ve->siblings[n]); struct ve_node * const node = &ve->nodes[sibling->id]; @@ -5478,20 +5472,19 @@ static void virtual_submission_tasklet(unsigned long data) if (!READ_ONCE(ve->request)) break; /* already handled by a sibling's tasklet */ + spin_lock_irq(&sibling->active.lock); + if (unlikely(!(mask & sibling->mask))) { if (!RB_EMPTY_NODE(&node->rb)) { - spin_lock(&sibling->active.lock); rb_erase_cached(&node->rb, &sibling->execlists.virtual); RB_CLEAR_NODE(&node->rb); - spin_unlock(&sibling->active.lock); } - continue; - } - spin_lock(&sibling->active.lock); + goto unlock_engine; + } - if (!RB_EMPTY_NODE(&node->rb)) { + if (unlikely(!RB_EMPTY_NODE(&node->rb))) { /* * Cheat and avoid rebalancing the tree if we can * reuse this node in situ. @@ -5531,9 +5524,12 @@ static void virtual_submission_tasklet(unsigned long data) if (first && prio > sibling->execlists.queue_priority_hint) tasklet_hi_schedule(&sibling->execlists.tasklet); - spin_unlock(&sibling->active.lock); +unlock_engine: + spin_unlock_irq(&sibling->active.lock); + + if (intel_context_inflight(&ve->context)) + break; } - local_irq_enable(); } static void virtual_submit_request(struct i915_request *rq) From patchwork Mon May 25 20:29:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11569437 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C244860D for ; Mon, 25 May 2020 20:29:42 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AB64120776 for ; Mon, 25 May 2020 20:29:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AB64120776 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 39C0089F07; Mon, 25 May 2020 20:29:42 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7A58A89F07 for ; Mon, 25 May 2020 20:29:40 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21290262-1500050 for multiple; Mon, 25 May 2020 21:28:59 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 25 May 2020 21:29:01 +0100 Message-Id: <20200525202901.32244-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200525202901.32244-1-chris@chris-wilson.co.uk> References: <20200525202901.32244-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 4/4] drm/i915/gt: Resubmit the virtual engine on schedule-out X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Having recognised that we do not change the sibling until we schedule out, we can then defer the decision to resubmit the virtual engine from the unwind of the active queue to scheduling out of the virtual context. By keeping the unwind order intact on the local engine, we can preserve data dependency ordering while doing a preempt-to-busy pass until we have determined the new ELSP. This means that if we try to timeslice between a virtual engine and a data-dependent ordinary request, the pair will maintain their relative ordering and we will avoid the resubmission, cancelling the timeslicing until further change. The dilemma though is that we then may end up in a situation where the 'demotion' of the virtual request to an ordinary request in the engine queue results in filling the ELSP[] with virtual requests instead of spreading the load across the engines. To compensate for this, we mark each virtual request and refuse to resubmit a virtual request in the secondary ELSP slots, thus forcing subsequent virtual requests to be scheduled out after timeslicing. By delaying the decision until we schedule out, we will avoid unnecessary resubmission. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_lrc.c | 111 ++++++++++++++++--------- drivers/gpu/drm/i915/gt/selftest_lrc.c | 2 +- 2 files changed, 74 insertions(+), 39 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 665d62c3a54d..a2cc6a370f95 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1114,46 +1114,17 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) __i915_request_unsubmit(rq); - /* - * Push the request back into the queue for later resubmission. - * If this request is not native to this physical engine (i.e. - * it came from a virtual source), push it back onto the virtual - * engine so that it can be moved across onto another physical - * engine as load dictates. - */ - if (likely(rq->execution_mask == engine->mask)) { - GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID); - if (rq_prio(rq) != prio) { - prio = rq_prio(rq); - pl = i915_sched_lookup_priolist(engine, prio); - } - GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root)); - - list_move(&rq->sched.link, pl); - set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags); + GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID); + if (rq_prio(rq) != prio) { + prio = rq_prio(rq); + pl = i915_sched_lookup_priolist(engine, prio); + } + GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root)); - active = rq; - } else { - struct intel_engine_cs *owner = rq->context->engine; + list_move(&rq->sched.link, pl); + set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags); - /* - * Decouple the virtual breadcrumb before moving it - * back to the virtual engine -- we don't want the - * request to complete in the background and try - * and cancel the breadcrumb on the virtual engine - * (instead of the old engine where it is linked)! - */ - if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, - &rq->fence.flags)) { - spin_lock_nested(&rq->lock, - SINGLE_DEPTH_NESTING); - i915_request_cancel_breadcrumb(rq); - spin_unlock(&rq->lock); - } - WRITE_ONCE(rq->engine, owner); - owner->submit_request(rq); - active = NULL; - } + active = rq; } return active; @@ -1395,12 +1366,53 @@ execlists_schedule_in(struct i915_request *rq, int idx) return i915_request_get(rq); } +static void +resubmit_virtual_request(struct i915_request *rq, struct virtual_engine *ve) +{ + struct intel_engine_cs *engine = rq->engine; + + /* + * Note that although __execlists_schedule_out() may be called from + * inside execlists_dequeue (under the spinlock), it can only do so + * as a result of request completion, and a completed request is + * not resubmitted. + */ + spin_lock_irq(&engine->active.lock); + + /* + * Decouple the virtual breadcrumb before moving it back to the virtual + * engine -- we don't want the request to complete in the background + * and then try and cancel the breadcrumb on the virtual engine + * (instead of the old engine where it is linked)! + */ + if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) { + spin_lock_nested(&rq->lock, SINGLE_DEPTH_NESTING); + i915_request_cancel_breadcrumb(rq); + spin_unlock(&rq->lock); + } + + WRITE_ONCE(rq->engine, &ve->base); + ve->base.submit_request(rq); + + spin_unlock_irq(&engine->active.lock); +} + static void kick_siblings(struct i915_request *rq, struct intel_context *ce) { struct virtual_engine *ve = container_of(ce, typeof(*ve), context); if (READ_ONCE(ve->request)) tasklet_hi_schedule(&ve->base.execlists.tasklet); + + /* + * This engine is now too busy to run this virtual request, so + * see if we can find an alternative engine for it to execute on. + * Once a request has become bonded to this engine, we treat it the + * same as other native request. + */ + if (i915_request_in_priority_queue(rq) && + rq->execution_mask != rq->engine->mask) + resubmit_virtual_request(rq, ve); } static inline void @@ -1646,6 +1658,20 @@ assert_pending_valid(const struct intel_engine_execlists *execlists, return false; } + /* + * We want virtual requests to only be in the first slot so + * that they are never stuck behind a hog and can be immediately + * transferred onto the next idle engine. + */ + if (rq->execution_mask != engine->mask && + port != execlists->pending) { + GEM_TRACE_ERR("%s: virtual engine:%llx not in prime position[%zd]\n", + engine->name, + ce->timeline->fence_context, + port - execlists->pending); + return false; + } + /* Hold tightly onto the lock to prevent concurrent retires! */ if (!spin_trylock_irqsave(&rq->lock, flags)) continue; @@ -2343,6 +2369,15 @@ static void execlists_dequeue(struct intel_engine_cs *engine) if (i915_request_has_sentinel(last)) goto done; + /* + * We avoid submitting virtual requests into + * the secondary ports so that we can migrate + * the request immediately to another engine + * rather than wait for the primary request. + */ + if (rq->execution_mask != engine->mask) + goto done; + /* * If GVT overrides us we only ever submit * port[0], leaving port[1] empty. Note that we diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 66f710b1b61e..c2ae91eee701 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -4384,7 +4384,7 @@ static int reset_virtual_engine(struct intel_gt *gt, spin_lock_irq(&engine->active.lock); __unwind_incomplete_requests(engine); spin_unlock_irq(&engine->active.lock); - GEM_BUG_ON(rq->engine != ve->engine); + GEM_BUG_ON(rq->engine != engine); /* Reset the engine while keeping our active request on hold */ execlists_hold(engine, rq);