From patchwork Mon Jun 15 12:39:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11604755 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2563492A for ; Mon, 15 Jun 2020 12:39:39 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0E017206D7 for ; Mon, 15 Jun 2020 12:39:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0E017206D7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 05D226E2F0; Mon, 15 Jun 2020 12:39:32 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 244876E0F3 for ; Mon, 15 Jun 2020 12:39:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21501010-1500050 for multiple; Mon, 15 Jun 2020 13:39:21 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 15 Jun 2020 13:39:11 +0100 Message-Id: <20200615123920.17749-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 01/10] drm/i915/selftests: Disable preemptive heartbeats over preemption tests X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Since the heartbeat may cause a preemption event, disable it over the preemption suppression tests. Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/selftest_lrc.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index f651bdf7f191..91543494f595 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -2282,7 +2282,7 @@ static int live_suppress_self_preempt(void *arg) if (igt_flush_test(gt->i915)) goto err_wedged; - intel_engine_pm_get(engine); + engine_heartbeat_disable(engine); engine->execlists.preempt_hang.count = 0; rq_a = spinner_create_request(&a.spin, @@ -2290,14 +2290,14 @@ static int live_suppress_self_preempt(void *arg) MI_NOOP); if (IS_ERR(rq_a)) { err = PTR_ERR(rq_a); - intel_engine_pm_put(engine); + engine_heartbeat_enable(engine); goto err_client_b; } i915_request_add(rq_a); if (!igt_wait_for_spinner(&a.spin, rq_a)) { pr_err("First client failed to start\n"); - intel_engine_pm_put(engine); + engine_heartbeat_enable(engine); goto err_wedged; } @@ -2309,7 +2309,7 @@ static int live_suppress_self_preempt(void *arg) MI_NOOP); if (IS_ERR(rq_b)) { err = PTR_ERR(rq_b); - intel_engine_pm_put(engine); + engine_heartbeat_enable(engine); goto err_client_b; } i915_request_add(rq_b); @@ -2320,7 +2320,7 @@ static int live_suppress_self_preempt(void *arg) if (!igt_wait_for_spinner(&b.spin, rq_b)) { pr_err("Second client failed to start\n"); - intel_engine_pm_put(engine); + engine_heartbeat_enable(engine); goto err_wedged; } @@ -2334,12 +2334,12 @@ static int live_suppress_self_preempt(void *arg) engine->name, engine->execlists.preempt_hang.count, depth); - intel_engine_pm_put(engine); + engine_heartbeat_enable(engine); err = -EINVAL; goto err_client_b; } - intel_engine_pm_put(engine); + engine_heartbeat_enable(engine); if (igt_flush_test(gt->i915)) goto err_wedged; } From patchwork Mon Jun 15 12:39:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11604761 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B626792A for ; Mon, 15 Jun 2020 12:39:41 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9ECF7206D7 for ; Mon, 15 Jun 2020 12:39:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9ECF7206D7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3C5686E2F9; Mon, 15 Jun 2020 12:39:32 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2E7FF6E301 for ; Mon, 15 Jun 2020 12:39:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21501011-1500050 for multiple; Mon, 15 Jun 2020 13:39:21 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 15 Jun 2020 13:39:12 +0100 Message-Id: <20200615123920.17749-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200615123920.17749-1-chris@chris-wilson.co.uk> References: <20200615123920.17749-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 02/10] drm/i915/selftests: Dump engine state and trace upon hanging after reset X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" If the engine dies after a reset, and so we fail to submit a request but need to be interrupted by the CI runner, dump the engine state. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c index 2af66f8ffbd2..0f1be62cdc6f 100644 --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c @@ -499,6 +499,24 @@ static int igt_reset_nop_engine(void *arg) rq = intel_context_create_request(ce); if (IS_ERR(rq)) { + struct drm_printer p = + drm_info_printer(gt->i915->drm.dev); + intel_engine_dump(engine, &p, + "%s(%s): failed to submit request %llx:%lld\n", + __func__, + engine->name, + rq->fence.context, + rq->fence.seqno); + + + GEM_TRACE("%s(%s): failed to submit request %llx:%lld\n", + __func__, engine->name, + rq->fence.context, + rq->fence.seqno); + GEM_TRACE_DUMP(); + + intel_gt_set_wedged(gt); + err = PTR_ERR(rq); break; } From patchwork Mon Jun 15 12:39:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11604753 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 29CC092A for ; Mon, 15 Jun 2020 12:39:38 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0B9F0206D7 for ; Mon, 15 Jun 2020 12:39:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0B9F0206D7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 008C66E2EA; Mon, 15 Jun 2020 12:39:32 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 256856E113 for ; Mon, 15 Jun 2020 12:39:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21501012-1500050 for multiple; Mon, 15 Jun 2020 13:39:21 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 15 Jun 2020 13:39:13 +0100 Message-Id: <20200615123920.17749-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200615123920.17749-1-chris@chris-wilson.co.uk> References: <20200615123920.17749-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 03/10] drm/i915/gt: Add a safety submission flush in the heartbeat X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Just in case everything fails (like for example "missed interrupt syndrome" on Sandybridge), always flush the submission tasklet from the heartbeat. This papers over such issues, but will still appear as a second long glitch, and prevents us from detecting it unless we happen to be performing a timed test. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 19 +++++++------------ .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 3 +++ 2 files changed, 10 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index d613cf31970c..81884303bf6d 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1094,19 +1094,14 @@ void intel_engine_flush_submission(struct intel_engine_cs *engine) { struct tasklet_struct *t = &engine->execlists.tasklet; - if (__tasklet_is_scheduled(t)) { - local_bh_disable(); - if (tasklet_trylock(t)) { - /* Must wait for any GPU reset in progress. */ - if (__tasklet_is_enabled(t)) - t->func(t->data); - tasklet_unlock(t); - } - local_bh_enable(); + local_bh_disable(); + if (tasklet_trylock(t)) { + /* Must wait for any GPU reset in progress. */ + if (__tasklet_is_enabled(t)) + t->func(t->data); + tasklet_unlock(t); } - - /* Otherwise flush the tasklet if it was running on another cpu */ - tasklet_unlock_wait(t); + local_bh_enable(); } /** diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c index f67ad937eefb..cd20fb549b38 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c @@ -65,6 +65,9 @@ static void heartbeat(struct work_struct *wrk) struct intel_context *ce = engine->kernel_context; struct i915_request *rq; + /* Just in case everything has gone horribly wrong, give it a kick */ + intel_engine_flush_submission(engine); + rq = engine->heartbeat.systole; if (rq && i915_request_completed(rq)) { i915_request_put(rq); From patchwork Mon Jun 15 12:39:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11604757 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3166D90 for ; Mon, 15 Jun 2020 12:39:40 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 19E6E206D7 for ; Mon, 15 Jun 2020 12:39:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 19E6E206D7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 288FA6E2F8; Mon, 15 Jun 2020 12:39:32 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 267DD6E2EA for ; Mon, 15 Jun 2020 12:39:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21501013-1500050 for multiple; Mon, 15 Jun 2020 13:39:21 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 15 Jun 2020 13:39:14 +0100 Message-Id: <20200615123920.17749-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200615123920.17749-1-chris@chris-wilson.co.uk> References: <20200615123920.17749-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 04/10] drm/i915/execlists: Replace direct submit with direct call to tasklet X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Rather than having special case code for opportunistically calling process_csb() and performing a direct submit while holding the engine spinlock for submitting the request, simply call the tasklet directly. This allows us to retain the direct submission path, including the CS draining to allow fast/immediate submissions, without requiring any duplicated code paths. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine.h | 1 + drivers/gpu/drm/i915/gt/intel_engine_cs.c | 27 +++---- drivers/gpu/drm/i915/gt/intel_lrc.c | 78 ++++++++------------ drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 1 + 4 files changed, 45 insertions(+), 62 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index 791897f8d847..c77b3c0d2b3b 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -210,6 +210,7 @@ int intel_engine_resume(struct intel_engine_cs *engine); int intel_ring_submission_setup(struct intel_engine_cs *engine); +void __intel_engine_stop_cs(struct intel_engine_cs *engine); int intel_engine_stop_cs(struct intel_engine_cs *engine); void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 81884303bf6d..667ee52448a0 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -903,33 +903,34 @@ static unsigned long stop_timeout(const struct intel_engine_cs *engine) return READ_ONCE(engine->props.stop_timeout_ms); } -int intel_engine_stop_cs(struct intel_engine_cs *engine) +void __intel_engine_stop_cs(struct intel_engine_cs *engine) { struct intel_uncore *uncore = engine->uncore; - const u32 base = engine->mmio_base; - const i915_reg_t mode = RING_MI_MODE(base); - int err; + const i915_reg_t mode = RING_MI_MODE(engine->mmio_base); + intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING)); + intel_uncore_posting_read_fw(uncore, mode); +} + +int intel_engine_stop_cs(struct intel_engine_cs *engine) +{ if (INTEL_GEN(engine->i915) < 3) return -ENODEV; ENGINE_TRACE(engine, "\n"); - intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING)); + __intel_engine_stop_cs(engine); - err = 0; - if (__intel_wait_for_register_fw(uncore, - mode, MODE_IDLE, MODE_IDLE, + if (__intel_wait_for_register_fw(engine->uncore, + RING_MI_MODE(engine->mmio_base), + MODE_IDLE, MODE_IDLE, 1000, stop_timeout(engine), NULL)) { ENGINE_TRACE(engine, "timed out on STOP_RING -> IDLE\n"); - err = -ETIMEDOUT; + return -ETIMEDOUT; } - /* A final mmio read to let GPU writes be hopefully flushed to memory */ - intel_uncore_posting_read_fw(uncore, mode); - - return err; + return 0; } void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index e866b8d721ed..40c5085765da 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -2703,16 +2703,6 @@ static void process_csb(struct intel_engine_cs *engine) invalidate_csb_entries(&buf[0], &buf[num_entries - 1]); } -static void __execlists_submission_tasklet(struct intel_engine_cs *const engine) -{ - lockdep_assert_held(&engine->active.lock); - if (!READ_ONCE(engine->execlists.pending[0])) { - rcu_read_lock(); /* protect peeking at execlists->active */ - execlists_dequeue(engine); - rcu_read_unlock(); - } -} - static void __execlists_hold(struct i915_request *rq) { LIST_HEAD(list); @@ -3102,7 +3092,7 @@ static bool preempt_timeout(const struct intel_engine_cs *const engine) if (!timer_expired(t)) return false; - return READ_ONCE(engine->execlists.pending[0]); + return engine->execlists.pending[0]; } /* @@ -3112,7 +3102,6 @@ static bool preempt_timeout(const struct intel_engine_cs *const engine) static void execlists_submission_tasklet(unsigned long data) { struct intel_engine_cs * const engine = (struct intel_engine_cs *)data; - bool timeout = preempt_timeout(engine); process_csb(engine); @@ -3122,16 +3111,17 @@ static void execlists_submission_tasklet(unsigned long data) execlists_reset(engine, "CS error"); } - if (!READ_ONCE(engine->execlists.pending[0]) || timeout) { + if (unlikely(preempt_timeout(engine))) + execlists_reset(engine, "preemption time out"); + + if (!engine->execlists.pending[0]) { unsigned long flags; + rcu_read_lock(); /* protect peeking at execlists->active */ spin_lock_irqsave(&engine->active.lock, flags); - __execlists_submission_tasklet(engine); + execlists_dequeue(engine); spin_unlock_irqrestore(&engine->active.lock, flags); - - /* Recheck after serialising with direct-submission */ - if (unlikely(timeout && preempt_timeout(engine))) - execlists_reset(engine, "preemption time out"); + rcu_read_unlock(); } } @@ -3163,26 +3153,16 @@ static void queue_request(struct intel_engine_cs *engine, set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags); } -static void __submit_queue_imm(struct intel_engine_cs *engine) -{ - struct intel_engine_execlists * const execlists = &engine->execlists; - - if (reset_in_progress(execlists)) - return; /* defer until we restart the engine following reset */ - - __execlists_submission_tasklet(engine); -} - -static void submit_queue(struct intel_engine_cs *engine, +static bool submit_queue(struct intel_engine_cs *engine, const struct i915_request *rq) { struct intel_engine_execlists *execlists = &engine->execlists; if (rq_prio(rq) <= execlists->queue_priority_hint) - return; + return false; execlists->queue_priority_hint = rq_prio(rq); - __submit_queue_imm(engine); + return true; } static bool ancestor_on_hold(const struct intel_engine_cs *engine, @@ -3196,20 +3176,22 @@ static void flush_csb(struct intel_engine_cs *engine) { struct intel_engine_execlists *el = &engine->execlists; - if (READ_ONCE(el->pending[0]) && tasklet_trylock(&el->tasklet)) { - if (!reset_in_progress(el)) - process_csb(engine); - tasklet_unlock(&el->tasklet); + if (!tasklet_trylock(&el->tasklet)) { + tasklet_hi_schedule(&el->tasklet); + return; } + + if (!reset_in_progress(el)) + execlists_submission_tasklet((unsigned long)engine); + + tasklet_unlock(&el->tasklet); } static void execlists_submit_request(struct i915_request *request) { struct intel_engine_cs *engine = request->engine; unsigned long flags; - - /* Hopefully we clear execlists->pending[] to let us through */ - flush_csb(engine); + bool submit = false; /* Will be called from irq-context when using foreign fences. */ spin_lock_irqsave(&engine->active.lock, flags); @@ -3224,10 +3206,13 @@ static void execlists_submit_request(struct i915_request *request) GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root)); GEM_BUG_ON(list_empty(&request->sched.link)); - submit_queue(engine, request); + submit = submit_queue(engine, request); } spin_unlock_irqrestore(&engine->active.lock, flags); + + if (submit) + flush_csb(engine); } static void __execlists_context_fini(struct intel_context *ce) @@ -4113,7 +4098,6 @@ static int execlists_resume(struct intel_engine_cs *engine) static void execlists_reset_prepare(struct intel_engine_cs *engine) { struct intel_engine_execlists * const execlists = &engine->execlists; - unsigned long flags; ENGINE_TRACE(engine, "depth<-%d\n", atomic_read(&execlists->tasklet.count)); @@ -4130,10 +4114,6 @@ static void execlists_reset_prepare(struct intel_engine_cs *engine) __tasklet_disable_sync_once(&execlists->tasklet); GEM_BUG_ON(!reset_in_progress(execlists)); - /* And flush any current direct submission. */ - spin_lock_irqsave(&engine->active.lock, flags); - spin_unlock_irqrestore(&engine->active.lock, flags); - /* * We stop engines, otherwise we might get failed reset and a * dead gpu (on elk). Also as modern gpu as kbl can suffer @@ -4147,7 +4127,7 @@ static void execlists_reset_prepare(struct intel_engine_cs *engine) * FIXME: Wa for more modern gens needs to be validated */ ring_set_paused(engine, 1); - intel_engine_stop_cs(engine); + __intel_engine_stop_cs(engine); engine->execlists.reset_ccid = active_ccid(engine); } @@ -4377,12 +4357,12 @@ static void execlists_reset_finish(struct intel_engine_cs *engine) * to sleep before we restart and reload a context. */ GEM_BUG_ON(!reset_in_progress(execlists)); - if (!RB_EMPTY_ROOT(&execlists->queue.rb_root)) - execlists->tasklet.func(execlists->tasklet.data); + GEM_BUG_ON(engine->execlists.pending[0]); + /* And kick in case we missed a new request submission. */ if (__tasklet_enable(&execlists->tasklet)) - /* And kick in case we missed a new request submission. */ - tasklet_hi_schedule(&execlists->tasklet); + flush_csb(engine); + ENGINE_TRACE(engine, "depth->%d\n", atomic_read(&execlists->tasklet.count)); } diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c index 0f1be62cdc6f..1f26db4fc998 100644 --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c @@ -1601,6 +1601,7 @@ static int __igt_atomic_reset_engine(struct intel_engine_cs *engine, p->critical_section_end(); tasklet_enable(t); + tasklet_hi_schedule(t); if (err) pr_err("i915_reset_engine(%s:%s) failed under %s\n", From patchwork Mon Jun 15 12:39:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11604751 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DC90790 for ; Mon, 15 Jun 2020 12:39:36 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C4764206D7 for ; Mon, 15 Jun 2020 12:39:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C4764206D7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B6FB26E113; Mon, 15 Jun 2020 12:39:31 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 24EA56E10F for ; Mon, 15 Jun 2020 12:39:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21501014-1500050 for multiple; Mon, 15 Jun 2020 13:39:22 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 15 Jun 2020 13:39:15 +0100 Message-Id: <20200615123920.17749-5-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200615123920.17749-1-chris@chris-wilson.co.uk> References: <20200615123920.17749-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 05/10] drm/i915/execlists: Defer schedule_out until after the next dequeue X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Inside schedule_out, we do extra work upon idling the context, such as updating the runtime, kicking off retires, kicking virtual engines. However, if we are in a series of processing single requests per contexts, we may find ourselves scheduling out the context, only to immediately schedule it back in during dequeue. This is just extra work that we can avoid if we keep the context marked as inflight across the dequeue. This becomes more significant later on for minimising virtual engine misses. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_context_types.h | 4 +- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 + drivers/gpu/drm/i915/gt/intel_engine_types.h | 13 +++++ drivers/gpu/drm/i915/gt/intel_lrc.c | 47 ++++++++++++++----- 4 files changed, 51 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 4954b0df4864..b63db45bab7b 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -45,8 +45,8 @@ struct intel_context { struct intel_engine_cs *engine; struct intel_engine_cs *inflight; -#define intel_context_inflight(ce) ptr_mask_bits(READ_ONCE((ce)->inflight), 2) -#define intel_context_inflight_count(ce) ptr_unmask_bits(READ_ONCE((ce)->inflight), 2) +#define intel_context_inflight(ce) ptr_mask_bits(READ_ONCE((ce)->inflight), 3) +#define intel_context_inflight_count(ce) ptr_unmask_bits(READ_ONCE((ce)->inflight), 3) struct i915_address_space *vm; struct i915_gem_context __rcu *gem_context; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 667ee52448a0..0e94e52ee760 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -515,6 +515,8 @@ void intel_engine_init_execlists(struct intel_engine_cs *engine) memset(execlists->pending, 0, sizeof(execlists->pending)); execlists->active = memset(execlists->inflight, 0, sizeof(execlists->inflight)); + execlists->inactive = + memset(execlists->post, 0, sizeof(execlists->post)); execlists->queue_priority_hint = INT_MIN; execlists->queue = RB_ROOT_CACHED; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 073c3769e8cc..31cf60cef5a8 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -208,6 +208,10 @@ struct intel_engine_execlists { * @active: the currently known context executing on HW */ struct i915_request * const *active; + /** + * @inactive: the current vacancy of completed CS + */ + struct i915_request **inactive; /** * @inflight: the set of contexts submitted and acknowleged by HW * @@ -225,6 +229,15 @@ struct intel_engine_execlists { * preemption or idle-to-active event. */ struct i915_request *pending[EXECLIST_MAX_PORTS + 1]; + /** + * @post: the set of completed context switches + * + * Since we may want to stagger the processing of the CS switches + * with the next submission, so that the context are notionally + * kept in flight across the dequeue, we defer scheduling out of + * the completed context switches. + */ + struct i915_request *post[2 * EXECLIST_MAX_PORTS + 1]; /** * @port_mask: number of execlist ports - 1 diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 40c5085765da..adc14adfa89c 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1385,6 +1385,8 @@ __execlists_schedule_in(struct i915_request *rq) execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN); intel_engine_context_in(engine); + CE_TRACE(ce, "schedule-in, ccid:%x\n", ce->lrc.ccid); + return engine; } @@ -1431,6 +1433,8 @@ __execlists_schedule_out(struct i915_request *rq, * refrain from doing non-trivial work here. */ + CE_TRACE(ce, "schedule-out, ccid:%x\n", ccid); + /* * If we have just completed this context, the engine may now be * idle and we want to re-enter powersaving. @@ -2055,9 +2059,10 @@ static void set_preempt_timeout(struct intel_engine_cs *engine, active_preempt_timeout(engine, rq)); } -static inline void clear_ports(struct i915_request **ports, int count) +static inline struct i915_request ** +clear_ports(struct i915_request **ports, int count) { - memset_p((void **)ports, NULL, count); + return memset_p((void **)ports, NULL, count); } static void execlists_dequeue(struct intel_engine_cs *engine) @@ -2433,7 +2438,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) if (!memcmp(active, execlists->pending, (port - execlists->pending + 1) * sizeof(*port))) { do - execlists_schedule_out(fetch_and_zero(port)); + *execlists->inactive++ = *port; while (port-- != execlists->pending); goto skip_submit; @@ -2447,6 +2452,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) start_timeslice(engine, execlists->queue_priority_hint); skip_submit: ring_set_paused(engine, 0); + execlists->pending[0] = NULL; } } @@ -2456,12 +2462,12 @@ cancel_port_requests(struct intel_engine_execlists * const execlists) struct i915_request * const *port; for (port = execlists->pending; *port; port++) - execlists_schedule_out(*port); + *execlists->inactive++ = *port; clear_ports(execlists->pending, ARRAY_SIZE(execlists->pending)); /* Mark the end of active before we overwrite *active */ for (port = xchg(&execlists->active, execlists->pending); *port; port++) - execlists_schedule_out(*port); + *execlists->inactive++ = *port; clear_ports(execlists->inflight, ARRAY_SIZE(execlists->inflight)); smp_wmb(); /* complete the seqlock for execlists_active() */ @@ -2622,7 +2628,7 @@ static void process_csb(struct intel_engine_cs *engine) /* cancel old inflight, prepare for switch */ trace_ports(execlists, "preempted", old); while (*old) - execlists_schedule_out(*old++); + *execlists->inactive++ = *old++; /* switch pending to inflight */ GEM_BUG_ON(!assert_pending_valid(execlists, "promote")); @@ -2679,7 +2685,7 @@ static void process_csb(struct intel_engine_cs *engine) regs[CTX_RING_TAIL]); } - execlists_schedule_out(*execlists->active++); + *execlists->inactive++ = *execlists->active++; GEM_BUG_ON(execlists->active - execlists->inflight > execlists_num_ports(execlists)); @@ -2703,6 +2709,20 @@ static void process_csb(struct intel_engine_cs *engine) invalidate_csb_entries(&buf[0], &buf[num_entries - 1]); } +static void post_process_csb(struct intel_engine_cs *engine) +{ + struct intel_engine_execlists * const el = &engine->execlists; + struct i915_request **port; + + if (!el->post[0]) + return; + + GEM_BUG_ON(el->post[2 * EXECLIST_MAX_PORTS]); + for (port = el->post; *port; port++) + execlists_schedule_out(*port); + el->inactive = clear_ports(el->post, port - el->post); +} + static void __execlists_hold(struct i915_request *rq) { LIST_HEAD(list); @@ -2971,8 +2991,8 @@ active_context(struct intel_engine_cs *engine, u32 ccid) for (port = el->active; (rq = *port); port++) { if (rq->context->lrc.ccid == ccid) { ENGINE_TRACE(engine, - "ccid found at active:%zd\n", - port - el->active); + "ccid:%x found at active:%zd\n", + ccid, port - el->active); return rq; } } @@ -2980,8 +3000,8 @@ active_context(struct intel_engine_cs *engine, u32 ccid) for (port = el->pending; (rq = *port); port++) { if (rq->context->lrc.ccid == ccid) { ENGINE_TRACE(engine, - "ccid found at pending:%zd\n", - port - el->pending); + "ccid:%x found at pending:%zd\n", + ccid, port - el->pending); return rq; } } @@ -3123,6 +3143,8 @@ static void execlists_submission_tasklet(unsigned long data) spin_unlock_irqrestore(&engine->active.lock, flags); rcu_read_unlock(); } + + post_process_csb(engine); } static void __execlists_kick(struct intel_engine_execlists *execlists) @@ -4061,8 +4083,6 @@ static void enable_execlists(struct intel_engine_cs *engine) ENGINE_POSTING_READ(engine, RING_HWS_PGA); enable_error_interrupt(engine); - - engine->context_tag = GENMASK(BITS_PER_LONG - 2, 0); } static bool unexpected_starting_state(struct intel_engine_cs *engine) @@ -5104,6 +5124,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine) else execlists->csb_size = GEN11_CSB_ENTRIES; + engine->context_tag = GENMASK(BITS_PER_LONG - 2, 0); if (INTEL_GEN(engine->i915) >= 11) { execlists->ccid |= engine->instance << (GEN11_ENGINE_INSTANCE_SHIFT - 32); execlists->ccid |= engine->class << (GEN11_ENGINE_CLASS_SHIFT - 32); From patchwork Mon Jun 15 12:39:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11604747 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 87A8192A for ; Mon, 15 Jun 2020 12:39:33 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 661FE206D7 for ; Mon, 15 Jun 2020 12:39:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 661FE206D7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 980D36E0F3; Mon, 15 Jun 2020 12:39:31 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 27F146E2F9 for ; Mon, 15 Jun 2020 12:39:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21501015-1500050 for multiple; Mon, 15 Jun 2020 13:39:22 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 15 Jun 2020 13:39:16 +0100 Message-Id: <20200615123920.17749-6-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200615123920.17749-1-chris@chris-wilson.co.uk> References: <20200615123920.17749-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 06/10] drm/i915/gt: ce->inflight updates are now serialised X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Since schedule-in and schedule-out are now both always under the tasklet bitlock, we can reduce the individual atomic operations to simple instructions and worry less. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 43 ++++++++++++----------------- 1 file changed, 18 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index adc14adfa89c..8b3959207c02 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1371,7 +1371,7 @@ __execlists_schedule_in(struct i915_request *rq) unsigned int tag = ffs(READ_ONCE(engine->context_tag)); GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG); - clear_bit(tag - 1, &engine->context_tag); + __clear_bit(tag - 1, &engine->context_tag); ce->lrc.ccid = tag << (GEN11_SW_CTX_ID_SHIFT - 32); BUILD_BUG_ON(BITS_PER_LONG > GEN12_MAX_CONTEXT_HW_ID); @@ -1399,13 +1399,10 @@ execlists_schedule_in(struct i915_request *rq, int idx) GEM_BUG_ON(!intel_engine_pm_is_awake(rq->engine)); trace_i915_request_in(rq, idx); - old = READ_ONCE(ce->inflight); - do { - if (!old) { - WRITE_ONCE(ce->inflight, __execlists_schedule_in(rq)); - break; - } - } while (!try_cmpxchg(&ce->inflight, &old, ptr_inc(old))); + old = ce->inflight; + if (!old) + old = __execlists_schedule_in(rq); + WRITE_ONCE(ce->inflight, ptr_inc(old)); GEM_BUG_ON(intel_context_inflight(ce) != rq->engine); return i915_request_get(rq); @@ -1420,12 +1417,11 @@ static void kick_siblings(struct i915_request *rq, struct intel_context *ce) tasklet_hi_schedule(&ve->base.execlists.tasklet); } -static inline void -__execlists_schedule_out(struct i915_request *rq, - struct intel_engine_cs * const engine, - unsigned int ccid) +static inline void __execlists_schedule_out(struct i915_request *rq) { struct intel_context * const ce = rq->context; + struct intel_engine_cs * const engine = ce->inflight; + unsigned int ccid; /* * NB process_csb() is not under the engine->active.lock and hence @@ -1433,7 +1429,7 @@ __execlists_schedule_out(struct i915_request *rq, * refrain from doing non-trivial work here. */ - CE_TRACE(ce, "schedule-out, ccid:%x\n", ccid); + CE_TRACE(ce, "schedule-out, ccid:%x\n", ce->lrc.ccid); /* * If we have just completed this context, the engine may now be @@ -1443,12 +1439,13 @@ __execlists_schedule_out(struct i915_request *rq, i915_request_completed(rq)) intel_engine_add_retire(engine, ce->timeline); + ccid = ce->lrc.ccid; ccid >>= GEN11_SW_CTX_ID_SHIFT - 32; ccid &= GEN12_MAX_CONTEXT_HW_ID; if (ccid < BITS_PER_LONG) { GEM_BUG_ON(ccid == 0); GEM_BUG_ON(test_bit(ccid - 1, &engine->context_tag)); - set_bit(ccid - 1, &engine->context_tag); + __set_bit(ccid - 1, &engine->context_tag); } intel_context_update_runtime(ce); @@ -1469,26 +1466,22 @@ __execlists_schedule_out(struct i915_request *rq, */ if (ce->engine != engine) kick_siblings(rq, ce); - - intel_context_put(ce); } static inline void execlists_schedule_out(struct i915_request *rq) { struct intel_context * const ce = rq->context; - struct intel_engine_cs *cur, *old; - u32 ccid; trace_i915_request_out(rq); - ccid = rq->context->lrc.ccid; - old = READ_ONCE(ce->inflight); - do - cur = ptr_unmask_bits(old, 2) ? ptr_dec(old) : NULL; - while (!try_cmpxchg(&ce->inflight, &old, cur)); - if (!cur) - __execlists_schedule_out(rq, old, ccid); + GEM_BUG_ON(!ce->inflight); + ce->inflight = ptr_dec(ce->inflight); + if (!intel_context_inflight_count(ce)) { + __execlists_schedule_out(rq); + WRITE_ONCE(ce->inflight, NULL); + intel_context_put(ce); + } i915_request_put(rq); } From patchwork Mon Jun 15 12:39:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11604759 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E794B92A for ; Mon, 15 Jun 2020 12:39:40 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D00AE206D7 for ; Mon, 15 Jun 2020 12:39:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D00AE206D7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6C0806E30F; Mon, 15 Jun 2020 12:39:32 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 276886E2F8 for ; Mon, 15 Jun 2020 12:39:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21501016-1500050 for multiple; Mon, 15 Jun 2020 13:39:22 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 15 Jun 2020 13:39:17 +0100 Message-Id: <20200615123920.17749-7-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200615123920.17749-1-chris@chris-wilson.co.uk> References: <20200615123920.17749-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 07/10] drm/i915/gt: Drop atomic for engine->fw_active tracking X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Since schedule-in/out is now entirely serialised by the tasklet bitlock, we do not need to worry about concurrent in/out operations and so reduce the atomic operations to plain instructions. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- drivers/gpu/drm/i915/gt/intel_engine_types.h | 2 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 0e94e52ee760..c91d18b384e7 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1524,7 +1524,7 @@ void intel_engine_dump(struct intel_engine_cs *engine, drm_printf(m, "\tLatency: %luus\n", ewma__engine_latency_read(&engine->latency)); drm_printf(m, "\tForcewake: %x domains, %d active\n", - engine->fw_domain, atomic_read(&engine->fw_active)); + engine->fw_domain, READ_ONCE(engine->fw_active)); rcu_read_lock(); rq = READ_ONCE(engine->heartbeat.systole); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 31cf60cef5a8..ca124f229f65 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -335,7 +335,7 @@ struct intel_engine_cs { * as possible. */ enum forcewake_domains fw_domain; - atomic_t fw_active; + unsigned int fw_active; unsigned long context_tag; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 8b3959207c02..09ec7242fbcb 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1380,7 +1380,7 @@ __execlists_schedule_in(struct i915_request *rq) ce->lrc.ccid |= engine->execlists.ccid; __intel_gt_pm_get(engine->gt); - if (engine->fw_domain && !atomic_fetch_inc(&engine->fw_active)) + if (engine->fw_domain && !engine->fw_active++) intel_uncore_forcewake_get(engine->uncore, engine->fw_domain); execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN); intel_engine_context_in(engine); @@ -1451,7 +1451,7 @@ static inline void __execlists_schedule_out(struct i915_request *rq) intel_context_update_runtime(ce); intel_engine_context_out(engine); execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT); - if (engine->fw_domain && !atomic_dec_return(&engine->fw_active)) + if (engine->fw_domain && !--engine->fw_active) intel_uncore_forcewake_put(engine->uncore, engine->fw_domain); intel_gt_pm_put_async(engine->gt); From patchwork Mon Jun 15 12:39:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11604763 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2DAD990 for ; Mon, 15 Jun 2020 12:39:43 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1640B2078A for ; Mon, 15 Jun 2020 12:39:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1640B2078A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9A90E6E30E; Mon, 15 Jun 2020 12:39:42 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id D11D46E0F3 for ; Mon, 15 Jun 2020 12:39:30 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21501017-1500050 for multiple; Mon, 15 Jun 2020 13:39:22 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 15 Jun 2020 13:39:18 +0100 Message-Id: <20200615123920.17749-8-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200615123920.17749-1-chris@chris-wilson.co.uk> References: <20200615123920.17749-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 08/10] drm/i915/gt: Use virtual_engine during execlists_dequeue X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Rather than going back and forth between the rb_node entry and the virtual_engine type, store the ve local and reuse it. As the container_of conversion from rb_node to virtual_engine requires a variable offset, performing that conversion just once shaves off a bit of code. v2: Keep a single virtual engine lookup, for typical use. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 214 +++++++++++++--------------- 1 file changed, 101 insertions(+), 113 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 09ec7242fbcb..34a8eadc2de3 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -454,7 +454,7 @@ static int queue_prio(const struct intel_engine_execlists *execlists) static inline bool need_preempt(const struct intel_engine_cs *engine, const struct i915_request *rq, - struct rb_node *rb) + struct virtual_engine *ve) { int last_prio; @@ -491,9 +491,7 @@ static inline bool need_preempt(const struct intel_engine_cs *engine, rq_prio(list_next_entry(rq, sched.link)) > last_prio) return true; - if (rb) { - struct virtual_engine *ve = - rb_entry(rb, typeof(*ve), nodes[engine->id].rb); + if (ve) { bool preempt = false; if (engine == ve->siblings[0]) { /* only preempt one sibling */ @@ -1816,6 +1814,35 @@ static bool virtual_matches(const struct virtual_engine *ve, return true; } +static struct virtual_engine * +first_virtual_engine(struct intel_engine_cs *engine) +{ + struct intel_engine_execlists *el = &engine->execlists; + struct rb_node *rb = rb_first_cached(&el->virtual); + + while (rb) { + struct virtual_engine *ve = + rb_entry(rb, typeof(*ve), nodes[engine->id].rb); + struct i915_request *rq = READ_ONCE(ve->request); + + if (!rq) { /* lazily cleanup after another engine handled rq */ + rb_erase_cached(rb, &el->virtual); + RB_CLEAR_NODE(rb); + rb = rb_first_cached(&el->virtual); + continue; + } + + if (!virtual_matches(ve, rq, engine)) { + rb = rb_next(rb); + continue; + } + + return ve; + } + + return NULL; +} + static void virtual_xfer_breadcrumbs(struct virtual_engine *ve) { /* @@ -1900,7 +1927,7 @@ static void defer_active(struct intel_engine_cs *engine) static bool need_timeslice(const struct intel_engine_cs *engine, const struct i915_request *rq, - const struct rb_node *rb) + struct virtual_engine *ve) { int hint; @@ -1909,9 +1936,7 @@ need_timeslice(const struct intel_engine_cs *engine, hint = engine->execlists.queue_priority_hint; - if (rb) { - const struct virtual_engine *ve = - rb_entry(rb, typeof(*ve), nodes[engine->id].rb); + if (ve) { const struct intel_engine_cs *inflight = intel_context_inflight(&ve->context); @@ -2063,7 +2088,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine) struct intel_engine_execlists * const execlists = &engine->execlists; struct i915_request **port = execlists->pending; struct i915_request ** const last_port = port + execlists->port_mask; - struct i915_request * const *active; + struct i915_request * const *active = READ_ONCE(execlists->active); + struct virtual_engine *ve = first_virtual_engine(engine); struct i915_request *last; struct rb_node *rb; bool submit = false; @@ -2090,26 +2116,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * and context switches) submission. */ - for (rb = rb_first_cached(&execlists->virtual); rb; ) { - struct virtual_engine *ve = - rb_entry(rb, typeof(*ve), nodes[engine->id].rb); - struct i915_request *rq = READ_ONCE(ve->request); - - if (!rq) { /* lazily cleanup after another engine handled rq */ - rb_erase_cached(rb, &execlists->virtual); - RB_CLEAR_NODE(rb); - rb = rb_first_cached(&execlists->virtual); - continue; - } - - if (!virtual_matches(ve, rq, engine)) { - rb = rb_next(rb); - continue; - } - - break; - } - /* * If the queue is higher priority than the last * request in the currently active context, submit afresh. @@ -2117,10 +2123,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * the active context to interject the preemption request, * i.e. we will retrigger preemption following the ack in case * of trouble. - */ - active = READ_ONCE(execlists->active); - - /* + * * In theory we can skip over completed contexts that have not * yet been processed by events (as those events are in flight): * @@ -2131,9 +2134,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * find itself trying to jump back into a context it has just * completed and barf. */ - if ((last = *active)) { - if (need_preempt(engine, last, rb)) { + if (need_preempt(engine, last, ve)) { if (i915_request_completed(last)) { tasklet_hi_schedule(&execlists->tasklet); return; @@ -2164,7 +2166,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) __unwind_incomplete_requests(engine); last = NULL; - } else if (need_timeslice(engine, last, rb) && + } else if (need_timeslice(engine, last, ve) && timeslice_expired(execlists, last)) { if (i915_request_completed(last)) { tasklet_hi_schedule(&execlists->tasklet); @@ -2218,110 +2220,96 @@ static void execlists_dequeue(struct intel_engine_cs *engine) } } - while (rb) { /* XXX virtual is always taking precedence */ - struct virtual_engine *ve = - rb_entry(rb, typeof(*ve), nodes[engine->id].rb); + while (ve) { /* XXX virtual is always taking precedence */ struct i915_request *rq; spin_lock(&ve->base.active.lock); rq = ve->request; - if (unlikely(!rq)) { /* lost the race to a sibling */ - spin_unlock(&ve->base.active.lock); - rb_erase_cached(rb, &execlists->virtual); - RB_CLEAR_NODE(rb); - rb = rb_first_cached(&execlists->virtual); - continue; - } + if (unlikely(!rq)) /* lost the race to a sibling */ + goto unlock; GEM_BUG_ON(rq != ve->request); GEM_BUG_ON(rq->engine != &ve->base); GEM_BUG_ON(rq->context != &ve->context); - if (rq_prio(rq) >= queue_prio(execlists)) { - if (!virtual_matches(ve, rq, engine)) { - spin_unlock(&ve->base.active.lock); - rb = rb_next(rb); - continue; - } + if (unlikely(rq_prio(rq) < queue_prio(execlists))) { + spin_unlock(&ve->base.active.lock); + break; + } - if (last && !can_merge_rq(last, rq)) { - spin_unlock(&ve->base.active.lock); - start_timeslice(engine, rq_prio(rq)); - return; /* leave this for another sibling */ - } + GEM_BUG_ON(!virtual_matches(ve, rq, engine)); - ENGINE_TRACE(engine, - "virtual rq=%llx:%lld%s, new engine? %s\n", - rq->fence.context, - rq->fence.seqno, - i915_request_completed(rq) ? "!" : - i915_request_started(rq) ? "*" : - "", - yesno(engine != ve->siblings[0])); - - WRITE_ONCE(ve->request, NULL); - WRITE_ONCE(ve->base.execlists.queue_priority_hint, - INT_MIN); - rb_erase_cached(rb, &execlists->virtual); - RB_CLEAR_NODE(rb); + if (last && !can_merge_rq(last, rq)) { + spin_unlock(&ve->base.active.lock); + start_timeslice(engine, rq_prio(rq)); + return; /* leave this for another sibling */ + } - GEM_BUG_ON(!(rq->execution_mask & engine->mask)); - WRITE_ONCE(rq->engine, engine); + ENGINE_TRACE(engine, + "virtual rq=%llx:%lld%s, new engine? %s\n", + rq->fence.context, + rq->fence.seqno, + i915_request_completed(rq) ? "!" : + i915_request_started(rq) ? "*" : + "", + yesno(engine != ve->siblings[0])); - if (engine != ve->siblings[0]) { - u32 *regs = ve->context.lrc_reg_state; - unsigned int n; + WRITE_ONCE(ve->request, NULL); + WRITE_ONCE(ve->base.execlists.queue_priority_hint, INT_MIN); - GEM_BUG_ON(READ_ONCE(ve->context.inflight)); + rb = &ve->nodes[engine->id].rb; + rb_erase_cached(rb, &execlists->virtual); + RB_CLEAR_NODE(rb); - if (!intel_engine_has_relative_mmio(engine)) - virtual_update_register_offsets(regs, - engine); + GEM_BUG_ON(!(rq->execution_mask & engine->mask)); + WRITE_ONCE(rq->engine, engine); - if (!list_empty(&ve->context.signals)) - virtual_xfer_breadcrumbs(ve); + if (engine != ve->siblings[0]) { + u32 *regs = ve->context.lrc_reg_state; + unsigned int n; - /* - * Move the bound engine to the top of the list - * for future execution. We then kick this - * tasklet first before checking others, so that - * we preferentially reuse this set of bound - * registers. - */ - for (n = 1; n < ve->num_siblings; n++) { - if (ve->siblings[n] == engine) { - swap(ve->siblings[n], - ve->siblings[0]); - break; - } - } + GEM_BUG_ON(READ_ONCE(ve->context.inflight)); - GEM_BUG_ON(ve->siblings[0] != engine); - } + if (!intel_engine_has_relative_mmio(engine)) + virtual_update_register_offsets(regs, engine); - if (__i915_request_submit(rq)) { - submit = true; - last = rq; - } - i915_request_put(rq); + if (!list_empty(&ve->context.signals)) + virtual_xfer_breadcrumbs(ve); /* - * Hmm, we have a bunch of virtual engine requests, - * but the first one was already completed (thanks - * preempt-to-busy!). Keep looking at the veng queue - * until we have no more relevant requests (i.e. - * the normal submit queue has higher priority). + * Move the bound engine to the top of the list for + * future execution. We then kick this tasklet first + * before checking others, so that we preferentially + * reuse this set of bound registers. */ - if (!submit) { - spin_unlock(&ve->base.active.lock); - rb = rb_first_cached(&execlists->virtual); - continue; + for (n = 1; n < ve->num_siblings; n++) { + if (ve->siblings[n] == engine) { + swap(ve->siblings[n], ve->siblings[0]); + break; + } } + + GEM_BUG_ON(ve->siblings[0] != engine); + } + + if (__i915_request_submit(rq)) { + submit = true; + last = rq; } + i915_request_put(rq); +unlock: spin_unlock(&ve->base.active.lock); - break; + + /* + * Hmm, we have a bunch of virtual engine requests, + * but the first one was already completed (thanks + * preempt-to-busy!). Keep looking at the veng queue + * until we have no more relevant requests (i.e. + * the normal submit queue has higher priority). + */ + ve = submit ? NULL : first_virtual_engine(engine); } while ((rb = rb_first_cached(&execlists->queue))) { From patchwork Mon Jun 15 12:39:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11604749 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8D7F490 for ; Mon, 15 Jun 2020 12:39:35 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 75A18206D7 for ; Mon, 15 Jun 2020 12:39:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 75A18206D7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A44586E10F; Mon, 15 Jun 2020 12:39:31 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 270F56E2F0 for ; Mon, 15 Jun 2020 12:39:29 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21501018-1500050 for multiple; Mon, 15 Jun 2020 13:39:22 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 15 Jun 2020 13:39:19 +0100 Message-Id: <20200615123920.17749-9-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200615123920.17749-1-chris@chris-wilson.co.uk> References: <20200615123920.17749-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 09/10] drm/i915/gt: Decouple inflight virtual engines X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Once a virtual engine has been bound to a sibling, it will remain bound until we finally schedule out the last active request. We can not rebind the context to a new sibling while it is inflight as the context save will conflict, hence we wait. As we cannot then use any other sibliing while the context is inflight, only kick the bound sibling while it inflight and upon scheduling out the kick the rest (so that we can swap engines on timeslicing if the previously bound engine becomes oversubscribed). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 30 +++++++++++++---------------- 1 file changed, 13 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 34a8eadc2de3..4a1e23e51314 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1409,9 +1409,8 @@ execlists_schedule_in(struct i915_request *rq, int idx) static void kick_siblings(struct i915_request *rq, struct intel_context *ce) { struct virtual_engine *ve = container_of(ce, typeof(*ve), context); - struct i915_request *next = READ_ONCE(ve->request); - if (next == rq || (next && next->execution_mask & ~rq->execution_mask)) + if (READ_ONCE(ve->request)) tasklet_hi_schedule(&ve->base.execlists.tasklet); } @@ -1825,18 +1824,14 @@ first_virtual_engine(struct intel_engine_cs *engine) rb_entry(rb, typeof(*ve), nodes[engine->id].rb); struct i915_request *rq = READ_ONCE(ve->request); - if (!rq) { /* lazily cleanup after another engine handled rq */ + /* lazily cleanup after another engine handled rq */ + if (!rq || !virtual_matches(ve, rq, engine)) { rb_erase_cached(rb, &el->virtual); RB_CLEAR_NODE(rb); rb = rb_first_cached(&el->virtual); continue; } - if (!virtual_matches(ve, rq, engine)) { - rb = rb_next(rb); - continue; - } - return ve; } @@ -5486,7 +5481,6 @@ static void virtual_submission_tasklet(unsigned long data) if (unlikely(!mask)) return; - local_irq_disable(); for (n = 0; n < ve->num_siblings; n++) { struct intel_engine_cs *sibling = READ_ONCE(ve->siblings[n]); struct ve_node * const node = &ve->nodes[sibling->id]; @@ -5496,20 +5490,19 @@ static void virtual_submission_tasklet(unsigned long data) if (!READ_ONCE(ve->request)) break; /* already handled by a sibling's tasklet */ + spin_lock_irq(&sibling->active.lock); + if (unlikely(!(mask & sibling->mask))) { if (!RB_EMPTY_NODE(&node->rb)) { - spin_lock(&sibling->active.lock); rb_erase_cached(&node->rb, &sibling->execlists.virtual); RB_CLEAR_NODE(&node->rb); - spin_unlock(&sibling->active.lock); } - continue; - } - spin_lock(&sibling->active.lock); + goto unlock_engine; + } - if (!RB_EMPTY_NODE(&node->rb)) { + if (unlikely(!RB_EMPTY_NODE(&node->rb))) { /* * Cheat and avoid rebalancing the tree if we can * reuse this node in situ. @@ -5549,9 +5542,12 @@ static void virtual_submission_tasklet(unsigned long data) if (first && prio > sibling->execlists.queue_priority_hint) tasklet_hi_schedule(&sibling->execlists.tasklet); - spin_unlock(&sibling->active.lock); +unlock_engine: + spin_unlock_irq(&sibling->active.lock); + + if (intel_context_inflight(&ve->context)) + break; } - local_irq_enable(); } static void virtual_submit_request(struct i915_request *rq) From patchwork Mon Jun 15 12:39:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11604769 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E1C9192A for ; Mon, 15 Jun 2020 12:40:01 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C9BA7206D7 for ; Mon, 15 Jun 2020 12:40:01 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C9BA7206D7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 28D866E317; Mon, 15 Jun 2020 12:40:01 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id D630E6E317 for ; Mon, 15 Jun 2020 12:39:58 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21501019-1500050 for multiple; Mon, 15 Jun 2020 13:39:22 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 15 Jun 2020 13:39:20 +0100 Message-Id: <20200615123920.17749-10-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200615123920.17749-1-chris@chris-wilson.co.uk> References: <20200615123920.17749-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 10/10] drm/i915/gt: Resubmit the virtual engine on schedule-out X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Having recognised that we do not change the sibling until we schedule out, we can then defer the decision to resubmit the virtual engine from the unwind of the active queue to scheduling out of the virtual context. By keeping the unwind order intact on the local engine, we can preserve data dependency ordering while doing a preempt-to-busy pass until we have determined the new ELSP. This means that if we try to timeslice between a virtual engine and a data-dependent ordinary request, the pair will maintain their relative ordering and we will avoid the resubmission, cancelling the timeslicing until further change. The dilemma though is that we then may end up in a situation where the 'demotion' of the virtual request to an ordinary request in the engine queue results in filling the ELSP[] with virtual requests instead of spreading the load across the engines. To compensate for this, we mark each virtual request and refuse to resubmit a virtual request in the secondary ELSP slots, thus forcing subsequent virtual requests to be scheduled out after timeslicing. By delaying the decision until we schedule out, we will avoid unnecessary resubmission. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_lrc.c | 118 ++++++++++++++++--------- drivers/gpu/drm/i915/gt/selftest_lrc.c | 2 +- 2 files changed, 75 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 4a1e23e51314..45029eb86b6e 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1117,53 +1117,23 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) __i915_request_unsubmit(rq); - /* - * Push the request back into the queue for later resubmission. - * If this request is not native to this physical engine (i.e. - * it came from a virtual source), push it back onto the virtual - * engine so that it can be moved across onto another physical - * engine as load dictates. - */ - if (likely(rq->execution_mask == engine->mask)) { - GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID); - if (rq_prio(rq) != prio) { - prio = rq_prio(rq); - pl = i915_sched_lookup_priolist(engine, prio); - } - GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root)); - - list_move(&rq->sched.link, pl); - set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags); + GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID); + if (rq_prio(rq) != prio) { + prio = rq_prio(rq); + pl = i915_sched_lookup_priolist(engine, prio); + } + GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root)); - /* Check in case we rollback so far we wrap [size/2] */ - if (intel_ring_direction(rq->ring, - intel_ring_wrap(rq->ring, - rq->tail), - rq->ring->tail) > 0) - rq->context->lrc.desc |= CTX_DESC_FORCE_RESTORE; + list_move(&rq->sched.link, pl); + set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags); - active = rq; - } else { - struct intel_engine_cs *owner = rq->context->engine; + /* Check in case we rollback so far we wrap [size/2] */ + if (intel_ring_direction(rq->ring, + intel_ring_wrap(rq->ring, rq->tail), + rq->ring->tail) > 0) + rq->context->lrc.desc |= CTX_DESC_FORCE_RESTORE; - /* - * Decouple the virtual breadcrumb before moving it - * back to the virtual engine -- we don't want the - * request to complete in the background and try - * and cancel the breadcrumb on the virtual engine - * (instead of the old engine where it is linked)! - */ - if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, - &rq->fence.flags)) { - spin_lock_nested(&rq->lock, - SINGLE_DEPTH_NESTING); - i915_request_cancel_breadcrumb(rq); - spin_unlock(&rq->lock); - } - WRITE_ONCE(rq->engine, owner); - owner->submit_request(rq); - active = NULL; - } + active = rq; } return active; @@ -1406,12 +1376,49 @@ execlists_schedule_in(struct i915_request *rq, int idx) return i915_request_get(rq); } +static void +resubmit_virtual_request(struct i915_request *rq, struct virtual_engine *ve) +{ + struct intel_engine_cs *engine = rq->engine; + unsigned long flags; + + spin_lock_irqsave(&engine->active.lock, flags); + + /* + * Decouple the virtual breadcrumb before moving it back to the virtual + * engine -- we don't want the request to complete in the background + * and then try and cancel the breadcrumb on the virtual engine + * (instead of the old engine where it is linked)! + */ + if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) { + spin_lock_nested(&rq->lock, SINGLE_DEPTH_NESTING); + i915_request_cancel_breadcrumb(rq); + spin_unlock(&rq->lock); + } + + clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags); + WRITE_ONCE(rq->engine, &ve->base); + ve->base.submit_request(rq); + + spin_unlock_irqrestore(&engine->active.lock, flags); +} + static void kick_siblings(struct i915_request *rq, struct intel_context *ce) { struct virtual_engine *ve = container_of(ce, typeof(*ve), context); if (READ_ONCE(ve->request)) tasklet_hi_schedule(&ve->base.execlists.tasklet); + + /* + * This engine is now too busy to run this virtual request, so + * see if we can find an alternative engine for it to execute on. + * Once a request has become bonded to this engine, we treat it the + * same as other native request. + */ + if (i915_request_in_priority_queue(rq) && + rq->execution_mask != rq->engine->mask) + resubmit_virtual_request(rq, ve); } static inline void __execlists_schedule_out(struct i915_request *rq) @@ -1650,6 +1657,20 @@ assert_pending_valid(const struct intel_engine_execlists *execlists, } sentinel = i915_request_has_sentinel(rq); + /* + * We want virtual requests to only be in the first slot so + * that they are never stuck behind a hog and can be immediately + * transferred onto the next idle engine. + */ + if (rq->execution_mask != engine->mask && + port != execlists->pending) { + GEM_TRACE_ERR("%s: virtual engine:%llx not in prime position[%zd]\n", + engine->name, + ce->timeline->fence_context, + port - execlists->pending); + return false; + } + /* Hold tightly onto the lock to prevent concurrent retires! */ if (!spin_trylock_irqsave(&rq->lock, flags)) continue; @@ -2346,6 +2367,15 @@ static void execlists_dequeue(struct intel_engine_cs *engine) if (i915_request_has_sentinel(last)) goto done; + /* + * We avoid submitting virtual requests into + * the secondary ports so that we can migrate + * the request immediately to another engine + * rather than wait for the primary request. + */ + if (rq->execution_mask != engine->mask) + goto done; + /* * If GVT overrides us we only ever submit * port[0], leaving port[1] empty. Note that we diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 91543494f595..a8d62faa3a88 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -4289,7 +4289,7 @@ static int reset_virtual_engine(struct intel_gt *gt, spin_lock_irq(&engine->active.lock); __unwind_incomplete_requests(engine); spin_unlock_irq(&engine->active.lock); - GEM_BUG_ON(rq->engine != ve->engine); + GEM_BUG_ON(rq->engine != engine); /* Reset the engine while keeping our active request on hold */ execlists_hold(engine, rq);