From patchwork Mon May 18 08:14:40 2020
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 11554975
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Mon, 18 May 2020 09:14:40 +0100
Message-Id: <20200518081440.17948-8-chris@chris-wilson.co.uk>
In-Reply-To: <20200518081440.17948-1-chris@chris-wilson.co.uk>
References: <20200518081440.17948-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 8/8] drm/i915/gt: Resubmit the virtual engine on schedule-out
Cc: Chris Wilson

Having recognised that we do not change the sibling until we schedule
out, we can defer the decision to resubmit the virtual engine from the
unwind of the active queue to the scheduling out of the virtual context.

By keeping the unwind order intact on the local engine, we preserve
data-dependency ordering while doing a preempt-to-busy pass, until we
have determined the new ELSP. This means that if we try to timeslice
between a virtual engine and a data-dependent ordinary request, the pair
will maintain their relative ordering and we will avoid the
resubmission, cancelling the timeslicing until further change.

The dilemma, though, is that the 'demotion' of the virtual request to an
ordinary request in the engine queue may then fill the ELSP[] with
virtual requests instead of spreading the load across the engines. To
compensate for this, we mark each virtual request and refuse to resubmit
a virtual request in the secondary ELSP slots, thus forcing subsequent
virtual requests to be scheduled out after timeslicing. By delaying the
decision until we schedule out, we avoid unnecessary resubmission.

Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/gt/intel_lrc.c    | 99 ++++++++++++++++----------
 drivers/gpu/drm/i915/gt/selftest_lrc.c |  2 +-
 2 files changed, 62 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index fe8f3518d6b8..a0e337b855e3 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1114,46 +1114,17 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 
 		__i915_request_unsubmit(rq);
 
-		/*
-		 * Push the request back into the queue for later resubmission.
-		 * If this request is not native to this physical engine (i.e.
-		 * it came from a virtual source), push it back onto the virtual
-		 * engine so that it can be moved across onto another physical
-		 * engine as load dictates.
-		 */
-		if (likely(rq->execution_mask == engine->mask)) {
-			GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
-			if (rq_prio(rq) != prio) {
-				prio = rq_prio(rq);
-				pl = i915_sched_lookup_priolist(engine, prio);
-			}
-			GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
-
-			list_move(&rq->sched.link, pl);
-			set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+		GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
+		if (rq_prio(rq) != prio) {
+			prio = rq_prio(rq);
+			pl = i915_sched_lookup_priolist(engine, prio);
+		}
+		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
 
-			active = rq;
-		} else {
-			struct intel_engine_cs *owner = rq->context->engine;
+		list_move(&rq->sched.link, pl);
+		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
 
-			/*
-			 * Decouple the virtual breadcrumb before moving it
-			 * back to the virtual engine -- we don't want the
-			 * request to complete in the background and try
-			 * and cancel the breadcrumb on the virtual engine
-			 * (instead of the old engine where it is linked)!
-			 */
-			if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
-				     &rq->fence.flags)) {
-				spin_lock_nested(&rq->lock,
-						 SINGLE_DEPTH_NESTING);
-				i915_request_cancel_breadcrumb(rq);
-				spin_unlock(&rq->lock);
-			}
-			WRITE_ONCE(rq->engine, owner);
-			owner->submit_request(rq);
-			active = NULL;
-		}
+		active = rq;
 	}
 
 	return active;
@@ -1395,12 +1366,41 @@ execlists_schedule_in(struct i915_request *rq, int idx)
 	return i915_request_get(rq);
 }
 
+static void
+resubmit_virtual_request(struct i915_request *rq, struct virtual_engine *ve)
+{
+	/*
+	 * Decouple the virtual breadcrumb before moving it back to the virtual
+	 * engine -- we don't want the request to complete in the background
+	 * and then try and cancel the breadcrumb on the virtual engine
+	 * (instead of the old engine where it is linked)!
+	 */
+	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) {
+		spin_lock_nested(&rq->lock, SINGLE_DEPTH_NESTING);
+		i915_request_cancel_breadcrumb(rq);
+		spin_unlock(&rq->lock);
+	}
+
+	WRITE_ONCE(rq->engine, &ve->base);
+	ve->base.submit_request(rq);
+}
+
 static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 {
 	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
 
 	if (READ_ONCE(ve->request))
 		tasklet_hi_schedule(&ve->base.execlists.tasklet);
+
+	/*
+	 * This engine is now too busy to run this virtual request, so
+	 * see if we can find an alternative engine for it to execute on.
+	 * Once a request has become bonded to this engine, we treat it the
+	 * same as other native request.
+	 */
+	if (i915_request_in_priority_queue(rq) &&
+	    rq->execution_mask != rq->engine->mask)
+		resubmit_virtual_request(rq, ve);
 }
 
 static inline void
@@ -1646,6 +1646,20 @@ assert_pending_valid(const struct intel_engine_execlists *execlists,
 			return false;
 		}
 
+		/*
+		 * We want virtual requests to only be in the first slot so
+		 * that they are never stuck behind a hog and can be immediately
+		 * transferred onto the next idle engine.
+		 */
+		if (rq->execution_mask != engine->mask &&
+		    port != execlists->pending) {
+			GEM_TRACE_ERR("%s: virtual engine:%llx not in prime position[%zd]\n",
+				      engine->name,
+				      ce->timeline->fence_context,
+				      port - execlists->pending);
+			return false;
+		}
+
 		/* Hold tightly onto the lock to prevent concurrent retires! */
 		if (!spin_trylock_irqsave(&rq->lock, flags))
 			continue;
@@ -2353,6 +2367,15 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				if (i915_request_has_sentinel(last))
 					goto done;
 
+				/*
+				 * We avoid submitting virtual requests into
+				 * the secondary ports so that we can migrate
+				 * the request immediately to another engine
+				 * rather than wait for the primary request.
+				 */
+				if (rq->execution_mask != engine->mask)
+					goto done;
+
 				/*
 				 * If GVT overrides us we only ever submit
 				 * port[0], leaving port[1] empty. Note that we
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 1fc54359bd53..e541ff47aa30 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -4393,7 +4393,7 @@ static int reset_virtual_engine(struct intel_gt *gt,
 	spin_lock_irq(&engine->active.lock);
 	__unwind_incomplete_requests(engine);
 	spin_unlock_irq(&engine->active.lock);
-	GEM_BUG_ON(rq->engine != ve->engine);
+	GEM_BUG_ON(rq->engine != engine);
 
 	/* Reset the engine while keeping our active request on hold */
 	execlists_hold(engine, rq);
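
For readers following the two rules in the patch (virtual requests only in the first ELSP slot, and the resubmit decision taken at schedule-out), here is a minimal userspace sketch of that logic. It is not driver code: struct model_request and the model_* helpers are hypothetical names invented for illustration, the queue is assumed to be already sorted by priority, and request coalescing, locking and breadcrumbs are all ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct i915_request; only the modelled fields. */
struct model_request {
	unsigned int execution_mask;	/* engines this request may run on */
	bool in_priority_queue;		/* unwound back into the engine queue? */
};

/* A request counts as virtual here if its mask differs from the engine's. */
static bool model_is_virtual(const struct model_request *rq,
			     unsigned int engine_mask)
{
	return rq->execution_mask != engine_mask;
}

/*
 * Fill up to nports slots from a queue assumed sorted by priority.
 * A virtual request is accepted only in slot 0; meeting one while
 * filling a secondary slot stops the fill, mirroring the "goto done"
 * added to execlists_dequeue(). Returns the number of slots filled.
 */
static size_t model_fill_ports(const struct model_request *queue, size_t count,
			       unsigned int engine_mask,
			       const struct model_request **ports, size_t nports)
{
	size_t used = 0;

	assert(nports > 0);
	for (size_t i = 0; i < count && used < nports; i++) {
		if (used > 0 && model_is_virtual(&queue[i], engine_mask))
			break;
		ports[used++] = &queue[i];
	}
	return used;
}

/*
 * Decision modelled on the check added to kick_siblings(): hand the
 * request back to the virtual engine only if it was unwound into the
 * priority queue and is not native to this engine.
 */
static bool model_should_resubmit(const struct model_request *rq,
				  unsigned int engine_mask)
{
	return rq->in_priority_queue && model_is_virtual(rq, engine_mask);
}
```

With an engine mask of 0x1 and a queue of { native, virtual }, model_fill_ports() fills only slot 0, which leaves the virtual request free to be picked up by another sibling instead of waiting behind the primary request.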